
9 posts tagged with "Computer Vision"


· 12 min read
Nischay Dhankhar

Effdet

Introduction

The EfficientDet model series was introduced by the Google Brain team in 2020 and outperforms almost every detection model of similar size on the majority of tasks. It brings several optimizations and architectural tweaks, including a Bi-directional Feature Pyramid Network (BiFPN) and a compound scaling method, which result in better fusion of features.
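As a rough illustration of the BiFPN idea, the sketch below shows the fast normalized weighted fusion BiFPN uses to combine feature maps from different levels. This is a simplified, hypothetical PyTorch snippet, not the EfficientDet implementation.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fast normalized fusion of same-shape feature maps, in the spirit of BiFPN.

    Each input gets a learnable non-negative weight; the inputs are combined
    as a weighted average and passed through a small depthwise-separable conv block.
    """
    def __init__(self, num_inputs: int, channels: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, groups=channels),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.SiLU(),
        )

    def forward(self, feats):
        w = torch.relu(self.weights)          # keep weights non-negative
        w = w / (w.sum() + self.eps)          # normalize without a softmax
        fused = sum(wi * f for wi, f in zip(w, feats))
        return self.conv(fused)

# Example: fuse two feature maps of the same spatial size and channel count.
p4, p5_up = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
fuse = WeightedFusion(num_inputs=2, channels=64)
out = fuse([p4, p5_up])  # shape: (1, 64, 32, 32)
```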

· 9 min read
Nischay Dhankhar

Finding Popularity of Pets

Introduction

In this competition, we will be predicting engagement with a shelter pet's profile based on its photograph. Along with the image of each pet, we are also provided metadata consisting of different features such as focus, eyes, etc. We aim to utilize both the images and the tabular data in the best possible way to minimize the error rate.
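One common way to combine the two modalities is to concatenate a CNN image embedding with the tabular metadata before a small regression head. The sketch below is a hypothetical PyTorch example of that pattern, not the competition solution; the 12 metadata columns are an assumption.

```python
import torch
import torch.nn as nn
from torchvision import models

class PetPopularityModel(nn.Module):
    """Joins a CNN image embedding with tabular metadata features."""
    def __init__(self, num_meta_features: int):
        super().__init__()
        backbone = models.resnet18(pretrained=True)  # any image backbone would do
        backbone.fc = nn.Identity()                  # keep the 512-d embedding
        self.backbone = backbone
        self.head = nn.Sequential(
            nn.Linear(512 + num_meta_features, 128),
            nn.ReLU(),
            nn.Linear(128, 1),                       # single popularity score
        )

    def forward(self, image, meta):
        img_emb = self.backbone(image)               # (batch, 512)
        x = torch.cat([img_emb, meta], dim=1)
        return self.head(x)

model = PetPopularityModel(num_meta_features=12)     # assumed number of metadata columns
images, meta = torch.randn(4, 3, 224, 224), torch.randn(4, 12)
scores = model(images, meta)                         # (4, 1)
```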

· 5 min read
Vishnu Subramanian

ResNet Strikes Back

Introduction

Most of the modern architectures proposed in computer vision use the ResNet architecture to benchmark their results. These novel architectures are trained with improved training strategies, data augmentation, and optimizers. In the ResNet Strikes Back paper, the authors retrain a ResNet-50 model with modern best practices and improve its top-1 accuracy from the standard 75.3% to 80.4%.
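As a rough sketch of what "modern best practices" can look like in code, here is a simplified example built from standard PyTorch components (label smoothing, decoupled weight decay, cosine schedule). It is an assumption for illustration, not the paper's exact A1/A2/A3 recipes.

```python
import torch
import torch.nn as nn
from torchvision import models

# Train a ResNet-50 from scratch with a few modern ingredients.
model = models.resnet50(num_classes=1000)

# Label smoothing instead of plain cross-entropy.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# AdamW with weight decay plus a cosine learning-rate schedule.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-3, weight_decay=0.02)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
```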

· 7 min read
Poonam Ligade

Understanding ResNet architecture

Why is it important to understand ResNet?

ResNets are the backbone behind most modern computer vision architectures. For a lot of common problems in computer vision, the go-to architecture is ResNet-34. Most modern CNN architectures like ResNeXt and DenseNet are variants of the original ResNet architecture. In different subfields of computer vision, such as object detection and image segmentation, ResNet plays an important role as a pre-trained backbone.
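A minimal sketch of the "pre-trained backbone" idea, assuming torchvision's ResNet-34 weights: strip the classification head and keep the convolutional stages, whose feature maps are what detection and segmentation models typically build on.

```python
import torch
import torch.nn as nn
from torchvision import models

resnet = models.resnet34(pretrained=True)

# Drop the average-pool and fully connected layers; keep the conv stages.
backbone = nn.Sequential(*list(resnet.children())[:-2])

x = torch.randn(1, 3, 224, 224)
features = backbone(x)          # (1, 512, 7, 7) feature map for downstream heads
```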

· 8 min read
Vishnu Subramanian

teddy with a mask

In the last several weeks I have seen a lot of posts showcasing demos/products that use AI algorithms to recognize people who are not wearing masks. In this post, I will take you through a very simple approach to show how you can build one yourself, and end by asking a few questions that can help in building a product that can be used in production. We will be using PyTorch for this task, but the steps would remain almost the same if you are trying to achieve it using another framework like TensorFlow, Keras, or MXNet.

To build any AI algorithm, the most common approach is to:

  1. Prepare a labeled dataset.
  2. Choose an architecture that suits your needs, preferably one that is pre-trained for your use case.
  3. Train and test the model (a rough sketch of these steps follows below).
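Here is a minimal PyTorch sketch of those three steps for a two-class mask classifier. It is an illustrative assumption, not the post's exact code; the "data/" folder layout (one subfolder per class) is hypothetical.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# 1. Labeled dataset: hypothetical folders like data/with_mask and data/without_mask.
tfms = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
dataset = datasets.ImageFolder("data/", transform=tfms)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

# 2. Pre-trained architecture with a new 2-class head.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)

# 3. Train (one simplified epoch here); evaluate on a held-out set in practice.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```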

· 12 min read
Vishnu Subramanian

View on Github

Transfer learning has become a key component of modern deep learning, both in CV and NLP. In this post, we will look at how to apply transfer learning to a computer vision classification problem. Along the way, I will show you how to tweak your neural network to achieve better results. We are using PyTorch for this, but the techniques we learn can be applied across other frameworks too.
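A minimal sketch of the kind of tweaks involved, assumed for illustration rather than taken from the post: freeze the pre-trained body, replace the head for your own classes, then fine-tune with a lower learning rate for the body than for the head.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet34(pretrained=True)

# Freeze the pre-trained body so only the new head trains at first.
for param in model.parameters():
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 10)   # new head for a 10-class problem

# Later, unfreeze everything and fine-tune with discriminative learning rates.
for param in model.parameters():
    param.requires_grad = True

optimizer = torch.optim.SGD(
    [
        {"params": [p for n, p in model.named_parameters() if not n.startswith("fc")], "lr": 1e-4},
        {"params": model.fc.parameters(), "lr": 1e-3},
    ],
    momentum=0.9,
)
```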

· 12 min read
Vishnu Subramanian

View on Github  

Building an image segmentation model using fastai and PyTorch | Part-2

In the first part, we looked at how we built the data pipeline required for creating data loaders using fastai. In this part, we will create different segmentation models that help us rank in the top 4% of the Kaggle leaderboard.


It is often said that modeling is the easy part, but that is rarely true. Building models for any kind of problem involves too many levers that can give you great or poor results. Anyone who has participated in a Kaggle competition or built a real-world deep learning model can tell stories of how a particular technique that worked for a certain problem did not work on a new dataset. We are going to take a step-by-step approach to building a model and improving it.
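To give a feel for the modeling step, here is a minimal fastai segmentation sketch using the small CamVid sample that ships with the library as a stand-in for the competition data. It uses fastai's documented `SegmentationDataLoaders` and `unet_learner` APIs, but it is only a sketch of the approach, not the series' exact code.

```python
from fastai.vision.all import *

# Tiny CamVid sample bundled with fastai, standing in for the competition data.
path = untar_data(URLs.CAMVID_TINY)
codes = np.loadtxt(path/'codes.txt', dtype=str)

dls = SegmentationDataLoaders.from_label_func(
    path, bs=8,
    fnames=get_image_files(path/'images'),
    label_func=lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes=codes,
)

# A U-Net with a pre-trained ResNet-34 encoder, then a short fine-tuning run.
learn = unet_learner(dls, resnet34)
learn.fine_tune(3, base_lr=1e-3)
```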

· 13 min read
Vishnu Subramanian

When working on deep learning projects, getting the right pipeline, from data processing to creating predictions, is a nontrivial task. In the last few years, several frameworks have been built on top of popular deep learning libraries like TensorFlow and PyTorch to accelerate building these pipelines. In this blog, we will explore fastai2, one of the popular frameworks; it is currently in early release, but the high-level API is stable enough for us to use.
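As a small taste of what that high-level API looks like, here is a generic image-classification sketch using fastai's documented DataBlock API on the Oxford Pets sample. It is a sketch for orientation, not this post's exact pipeline.

```python
from fastai.vision.all import *

# Sample dataset shipped with fastai; cat breeds have capitalised filenames.
path = untar_data(URLs.PETS)/'images'

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=lambda f: 'cat' if f.name[0].isupper() else 'dog',
    item_tfms=Resize(224),
)
dls = dblock.dataloaders(path, bs=32)

learn = cnn_learner(dls, resnet34, metrics=accuracy)
learn.fine_tune(1)
```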


About fastai2

One of the best ways to get to know the library is to go through the paper published by its authors. Here is the abstract from the paper:

fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai includes: a new type dispatch system for Python along with a semantic type hierarchy for tensors; a GPU-optimized computer vision library which can be extended in pure Python; an optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4-5 lines of code; a novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training; a new data block API; and much more. We have used this library to successfully create a complete deep learning course, which we were able to write more quickly than using previous approaches, and the code was more clear. The library is already in wide use in research, industry, and teaching. NB: This paper covers fastai v2, which is currently in pre-release at dev.fast.ai
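For instance, the 2-way callback system mentioned in the abstract lets you hook into any point of the training loop. A minimal, hypothetical example of a custom callback:

```python
from fastai.vision.all import *

class PrintLossCallback(Callback):
    """Logs the smoothed training loss at the end of every epoch."""
    def after_epoch(self):
        print(f"epoch {self.epoch}: train loss {float(self.smooth_loss):.4f}")

# Attach it to any Learner via the cbs argument, e.g.:
# learn = cnn_learner(dls, resnet34, cbs=PrintLossCallback())
```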