A summary of the common downstream tasks used to evaluate video representation learning for long, instructional videos.

What is Representation Learning?

Representation learning is an area of research that focuses on learning compact numerical representations of different types of signal, most often video, text, audio, and images. The goal of this research is to use these representations for other tasks, such as querying for information. …
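As a rough illustration of "querying for information" with learned representations, the sketch below ranks stored embeddings by cosine similarity to a query embedding. The vectors, clip names, and the `retrieve` helper are made up for this example; in practice the embeddings would come from trained video and text encoders.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query, index):
    """Return the key in `index` whose embedding is most similar to `query`."""
    return max(index, key=lambda name: cosine_similarity(query, index[name]))

# Hypothetical 3-dimensional embeddings (real ones have hundreds of dimensions).
video_embeddings = {
    "clip_cooking": [0.9, 0.1, 0.0],
    "clip_repair": [0.1, 0.8, 0.3],
}
text_query = [0.8, 0.2, 0.1]  # assumed embedding of a text query

best = retrieve(text_query, video_embeddings)
print(best)
```

The design choice here is that similarity in embedding space stands in for semantic similarity, which is the property representation learning tries to achieve.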


This story is a summary of the information you need to understand Covid-19 and how it affects you. It is accumulated from multiple sources as a means of fact checking.

Summary

  • Wear…


One of the distinctive differences between the information in a single image and the information in a video is the temporal element. This has led to deep learning architectures that incorporate 3D processing in order to capture temporal information as well. …


This article describes some of the state-of-the-art methods for depth prediction in image sequences captured by vehicles. These methods help in the development of new autonomous driving models without the use of extra cameras or sensors.

As mentioned in my previous article “How does Autonomous Driving Work? An Intro into SLAM”, many sensors are used to capture information while a vehicle is driving. The measurements captured include velocity, position, depth, thermal and more. These measurements are fed into a feedback system…


This paper is a review of research in quantum image processing (QIP), storage, and retrieval. It discusses current issues with silicon-based computing when processing big data for machine learning tasks such as image recognition, and how quantum computation can address these challenges. First, this paper will introduce the challenges…


Authors from Google extend prior research using state-of-the-art convolutional approaches to handle objects in images of varying scale [1], beating state-of-the-art models on semantic-segmentation benchmarks.

From Chen, L.-C., Papandreou, G., Schroff, F., & Adam, H., 2017 [1]

Introduction

One of the challenges in segmenting objects in images using deep convolutional neural networks (DCNNs) is that as the input feature map…


SLAM is the process by which a robot or vehicle builds a global map of its current environment and uses this map to navigate or deduce its location at any point in time [1–3].

SLAM is commonly used in autonomous navigation, especially to assist navigation in areas where global positioning systems (GPS) fail or in previously unseen areas. In this article, we will refer to the robot or vehicle as an ‘entity’. …


Introduction

Cyber attacks are on the rise. I do not need to provide much proof of that, as it is in the news almost every day! There are cyber security vendors that do their best to protect organizations’ machines, but there are always gaps that result in the need for human…


Introduction

A regularizer is commonly used in machine learning to constrain a model’s capacity to certain bounds, based either on a statistical norm or on prior hypotheses. This adds preference for one solution over another in the model’s hypothesis space, or the set of functions that the learning algorithm is allowed…
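To make the idea of "preference for one solution over another" concrete, here is a minimal sketch assuming an L2 (squared-norm) penalty added to a mean squared error loss. The data, the weight vectors, and the penalty strength `lam` are invented for illustration and are not taken from the article.

```python
def mse(weights, xs, ys):
    """Mean squared error of a linear model with the given weights."""
    preds = [sum(w * x for w, x in zip(weights, row)) for row in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def l2_penalty(weights, lam):
    """Squared L2 norm of the weights, scaled by the strength lam."""
    return lam * sum(w * w for w in weights)

def regularized_loss(weights, xs, ys, lam):
    """Data-fit term plus the regularization term."""
    return mse(weights, xs, ys) + l2_penalty(weights, lam)

# Toy data in which both features are identical, so many weight vectors fit perfectly.
xs = [[1.0, 1.0], [2.0, 2.0]]
ys = [2.0, 4.0]

small = [1.0, 1.0]   # fits the data with a small norm
large = [3.0, -1.0]  # fits the data equally well with a larger norm

lam = 0.1  # assumed penalty strength
print(mse(small, xs, ys), mse(large, xs, ys))
print(regularized_loss(small, xs, ys, lam), regularized_loss(large, xs, ys, lam))
```

Both weight vectors have the same unregularized error, so only the penalty term separates them; the regularized loss favors the smaller-norm solution.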


Graph neural networks (GNNs) have emerged as an interesting approach to a variety of problems, most prominently in the fields of chemistry and molecular biology. An example of the impact in these fields is DeepChem, a Python library that makes use of GNNs. …

Madeline Schiappa

PhD Student in the UCF Center for Research in Computer Vision https://www.linkedin.com/in/madelineschiappa/
