How to Read a Research Paper

If you work in a scientific field, you should try to build a deep and unbiased understanding of that field. This not only educates you in the best possible way but also helps you envision the opportunities in your space.

A research paper is often the culmination of a wide range of deep and authentic practices surrounding a topic. When writing a research paper, the author thinks critically about the problem, performs rigorous research, evaluates their processes and sources, organizes their thoughts, and then writes. These practices, when carried out genuinely, make for a good research paper.

If you’re struggling to build a habit of reading papers regularly (like I am), this guide tries to break down the whole process. I've talked to researchers in the field, read a bunch of papers and blogs from distinguished researchers, and jotted down some techniques that you can follow.

Let’s start off by understanding what a research paper is and what it is NOT!

What is a Research Paper?

A research paper is a dense and detailed manuscript that compiles a thorough understanding of a problem or topic. It offers a proposed solution and further research along with the conditions under which it was deduced and carried out, the efficacy of the solution and the research performed, and potential loopholes in the study.

A research paper is written not only to provide an exceptional learning opportunity but also to pave the way for further advancements in the field. These papers help other scholars germinate the seed of a thought that can lead either to a new world of ideas or to an innovative method of solving a longstanding problem.

What Research Papers are NOT

A common misconception is that a research paper is just a well-informed summary of a problem or topic compiled from other sources.

It isn't. Nor should you mistake it for a book or an opinionated account of one individual's interpretation of a particular topic.

Why Should You Read Research Papers?

What I find fascinating about reading a good research paper is that you can draw on a profound study of a topic and engage with the community on a new perspective to understand what can be achieved in and around that topic.

I work at the intersection of instructional design and data science. Learning is part of my day-to-day responsibilities. If the source of my education is flawed or inefficient, I’d fail at my job in the long term. This applies to many other jobs in science, especially those focused on research.

There are three important reasons to read a research paper:

Knowledge — Understanding the problem through the eyes of someone who has probably spent years solving it and has taken care of all the edge cases that you might not think of at the beginning.
Exploration — Whether you have a pinpointed agenda or not, there is a very high chance that you will stumble upon an edge case or a shortcoming that is worth following up. With persistent efforts over a considerable amount of time, you can learn to use that knowledge to make a living.
Research and review — One of the main reasons for writing a research paper is to further the development of the field. Researchers read papers to review them for conferences or to survey the literature of a new field. For example, Yann LeCun’s 1989 paper on integrating domain constraints into backpropagation laid the foundation of modern computer vision. After decades of research and development work, we have come so far that we're now refining solutions to problems like object detection and optimizing autonomous vehicles.

Not only that, with the help of the internet, you can turn any of these reasons or benefits into a livelihood. It can be an innovative state-of-the-art product, an efficient service model, a career as a content creator, or a dream job where you are solving problems that matter to you.

Goals for Reading a Research Paper — What Should You Read About?

The first thing to do is to figure out your motivation for reading the paper. There are two main scenarios that might lead you to read a paper:

Scenario 1 — You have a well-defined agenda/goal and you are deeply invested in a particular field. For example, you’re an NLP practitioner and you want to learn how GPT-4 has given us a breakthrough in NLP. This is always a nice scenario to be in as it offers clarity.
Scenario 2 — You want to keep abreast of the developments in a host of areas, say how a new deep learning architecture has helped us solve a 50-year-old biological problem of understanding protein structures. This is often the case for beginners or for people who consume their daily dose of news from research papers (yes, they exist!).

If you’re an inquisitive beginner with no starting point in mind, start with scenario 2. Shortlist a few topics you want to read about until you find an area that you find intriguing. This will eventually lead you to scenario 1.

ML Reproducibility Challenge

In addition to these generic goals, if you need an end goal for your habit-building exercise of reading research papers, you should check out the ML reproducibility challenge.

You’ll find top-class papers from world-class conferences that are worth diving deep into and whose results are worth reproducing.

They conduct this challenge twice a year and they have one coming up in Spring 2021. You should study the past three versions of the challenge, and I’ll write a detailed post on what to expect, how to prepare, and so on.

Now you must be wondering – how can you find the right paper to read?

How to Find the Right Paper to Read

To get some ideas around this, I reached out to my friend Anurag Ghosh, who is a researcher at Microsoft. Anurag has been working at the crossover of computer vision, machine learning, and systems engineering.

https://anuragxel.github.io/

Here are a few of his tips for getting started:

Always pick an area you're interested in.
Read a few good books or detailed blog posts on that topic and start diving deep by reading the papers referenced in those resources.
Look for seminal papers around that topic. These are papers that report a major breakthrough in the field and offer a new method or perspective with huge potential for subsequent research in that field. Check out papers from the morning paper or the CVF test of time award/Helmholtz prize (if you're interested in computer vision).
Check out books like Computer Vision: Algorithms and Applications by Richard Szeliski and look for the papers referenced there.
Have and build a sense of community. Find people who share similar interests, and join groups/subreddits/discord channels where such activities are promoted.

In addition to these invaluable tips, there are a number of web applications that I’ve shortlisted that help me narrow my search for the right papers to read:

r/MachineLearning — there are many researchers, practitioners, and engineers who share their work along with the papers they've found useful in achieving those results.

https://www.reddit.com/r/MachineLearning/

Arxiv Sanity Preserver — built by Andrej Karpathy to accelerate research. It is a repository of 142,846 papers from computer science, machine learning, systems, AI, Stats, CV, and so on. It also offers a bunch of filters, powerful search functionality, and a discussion forum to make for a super useful research platform.

Google Research — the research teams at Google are working on problems that have an impact on our everyday lives. They share their publications for individuals and teams to learn from, contribute to, and expedite research. They also have a Google AI blog that you can check out.

After you have stocked your to-read list, then comes the process of reading these papers. Remember that NOT every paper is useful to read and we need a mechanism that can help us quickly screen papers that are worth reading.

To tackle this challenge, you can use this Three-Pass Approach by S. Keshav. This approach proposes that you read the paper in three passes instead of starting from the beginning and diving in deep until the end.

The Three-Pass Approach

The first pass — is a quick scan to capture a high-level view of the paper. Read the title, abstract, and introduction carefully followed by the headings of the sections and subsections and lastly the conclusion. It should take you no more than 5–10 mins to figure out if you want to move to the second pass.
The second pass — is a more focused read that skips the technical proofs. Take notes on the crucial points and jot down key ideas in the margins. Carefully study the figures, diagrams, and illustrations. Review the graphs and mark relevant unread references for further reading. This helps you understand the background of the paper.
The third pass — reaching this pass denotes that you’ve found a paper that you want to deeply understand or review. The key to the third pass is to reproduce the results of the paper. Check all of its assumptions and jot down every difference between your re-implementation and the original results. Make a note of all the ideas for future analysis. It should take 5–6 hours for beginners and 1–2 hours for experienced readers.
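If it helps to see the screening logic laid out, here is a minimal Python sketch of the three passes as a small state tracker. It is only an illustration under stated assumptions: the names `Paper`, `ReadingPass`, `record_pass`, and `next_up`, as well as the sample paper titles, are hypothetical and do not come from S. Keshav's write-up or from any library.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class ReadingPass(IntEnum):
    """How far a paper has progressed (hypothetical labels for this sketch)."""
    UNREAD = 0
    FIRST = 1   # quick scan: title, abstract, introduction, headings, conclusion
    SECOND = 2  # focused read: notes, figures, references; proofs skipped
    THIRD = 3   # deep read: reproduce the results and compare


@dataclass
class Paper:
    title: str
    reached: ReadingPass = ReadingPass.UNREAD
    dropped: bool = False
    notes: list = field(default_factory=list)

    def record_pass(self, worth_continuing: bool, note: str = "") -> None:
        """Mark the next pass as done and decide whether to keep going."""
        if self.dropped or self.reached == ReadingPass.THIRD:
            return  # nothing left to do for this paper
        self.reached = ReadingPass(self.reached + 1)
        if note:
            self.notes.append(note)
        if not worth_continuing:
            self.dropped = True  # screened out; do not spend more time on it


def next_up(papers: list) -> list:
    """Papers still in play, with the ones furthest along listed first."""
    active = [p for p in papers
              if not p.dropped and p.reached < ReadingPass.THIRD]
    return sorted(active, key=lambda p: p.reached, reverse=True)


# Example with made-up titles: one paper survives the first pass, one does not.
queue = [Paper("Paper A"), Paper("Paper B")]
queue[0].record_pass(worth_continuing=True, note="Relevant to my project")
queue[1].record_pass(worth_continuing=False)
print([p.title for p in next_up(queue)])  # prints ['Paper A']
```

The point of the sketch is the early exit: a paper that fails a pass is dropped right there, so the expensive third pass is reserved for the few papers that survive the first two.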

Tools and Software to Keep Track of Your Pipeline of Papers

If you’re sincere about reading research papers, your list of papers will soon grow into an overwhelming stack that is hard to keep track of. Fortunately, we have software that can help us set up a mechanism to manage our research.

Here are a bunch of them that you can use:

Mendeley [not free] — you can add papers directly to your library from your browser, import documents, generate references and citations, collaborate with fellow researchers, and access your library from anywhere. This is mostly used by experienced researchers.

https://www.mendeley.com/?interaction_required=true

Zotero [free & open source] — Along the same lines as Mendeley but free of cost. You can make use of all the features but with limited storage space.

https://www.zotero.org/

Notion — this is great if you are just starting out and want to use something lightweight with the option to organize your papers, jot down notes, and manage everything in one workspace. It might not compare with the tools above, but I personally feel comfortable using Notion, and I have created this board, which you can duplicate, to keep track of my progress for now:

⚠️ Symptoms of Reading a Research Paper

Reading a research paper can turn out to be frustrating, challenging, and time-consuming, especially when you’re a beginner. You might face the following common symptoms:

Feeling dumb for not understanding a thing the paper says.
Finding yourself pushing too hard to understand the math behind those proofs.
Beating your head against the wall to wrap it around the number of acronyms used in the paper. Just kidding, you’ll have to look up those acronyms every now and then.
Being stuck on one paragraph for more than an hour.

Here’s a complete list of emotions that you might undergo as explained by Adam Ruben in this article.

Key Takeaways

We should be all set to dive right in. Here’s a quick summary of what we have covered here:

A research paper is an in-depth study that offers a detailed explanation of a topic or problem along with the research process, proofs, explained results, and ideas for future work.
Read research papers to develop a deep understanding of a topic/problem. Then you can either review papers as part of being a researcher, explore the domain and the kind of problems to build a solution or startup around it, or you can simply read them to keep abreast of the developments in your domain of interest.
If you’re a beginner, start with exploration to soon find your path to goal-oriented research.
To find good papers to read, you can use websites like Arxiv Sanity Preserver and Google Research, and subreddits like r/MachineLearning.
Reading approach — Use the three-pass approach to screen a paper quickly and decide how deeply to read it.
Keep track of your research, notes, and developments by using tools like Zotero/Notion.
This can get overwhelming in no time. Make sure you start off easy and increase your load progressively.

Remember: Art is not a single method or step done over a weekend but a process of accomplishing remarkable results over time.

You can also watch the video on this topic on my YouTube channel:

Feel free to respond to this blog or comment on the video if you have some tips, questions, or thoughts!

If this tutorial was helpful, you should check out my data science and machine learning courses on Wiplane Academy. They are comprehensive yet compact and help you build a solid foundation of work to showcase.