Research – PyTorch
https://pytorch.org

Using PyTorch for Monocular Depth Estimation Webinar
https://www.youtube.com/watch?v=xf2QgioY370 | Fri, 27 Sep 2024
In this webinar, Bob Chesebrough of Intel walks through the steps he took to remove background clutter from an image using monocular depth estimation with PyTorch. The technique could be used to automate structure-from-motion and other image-related tasks where you want to highlight or focus on a single portion of an image, particularly the parts closest to the camera.
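
The core idea is simple: predict a depth map, then keep only the pixels nearest the camera. Below is a minimal sketch of that idea using the publicly documented MiDaS entry points on torch.hub; the input filename and the 0.6 depth threshold are illustrative assumptions, not values from the webinar.

```python
# Sketch: foreground extraction via monocular depth estimation.
# Uses the public intel-isl/MiDaS torch.hub entry points; the threshold
# and filenames are illustrative, not from the webinar itself.
import cv2
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").to(device).eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

img = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
batch = transform(img).to(device)

with torch.no_grad():
    depth = midas(batch)
    depth = torch.nn.functional.interpolate(
        depth.unsqueeze(1), size=img.shape[:2], mode="bicubic", align_corners=False
    ).squeeze()

# MiDaS predicts relative inverse depth: larger values are closer to the camera.
depth = (depth - depth.min()) / (depth.max() - depth.min())
mask = (depth > 0.6).cpu().numpy()          # keep only the nearest regions
clipped = (img * mask[..., None]).astype("uint8")  # zero out background clutter
cv2.imwrite("clipped.png", cv2.cvtColor(clipped, cv2.COLOR_RGB2BGR))
```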

IBM Research: Bringing massive AI models to any cloud
https://research.ibm.com/blog/ibm-pytorch-cloud-ai-ethernet | Thu, 17 Nov 2022
The field of AI is in the middle of a revolution. In recent years, AI models have produced images, songs, and even websites from simple text prompts. These models with billions of parameters, called foundation models, can be repurposed from one task to another with little fine-tuning, eliminating the countless hours of training, labelling, and refitting otherwise needed to point a model at a new task.

ChemicalX: A Deep Learning Library for Drug Pair Scoring
https://arxiv.org/abs/2202.05240 | Thu, 10 Feb 2022
In this paper, we introduce ChemicalX, a PyTorch-based deep learning library designed to provide a range of state-of-the-art models for the drug pair scoring task. The primary objective of the library is to make deep drug pair scoring models accessible to machine learning researchers and practitioners in a streamlined manner. The design of ChemicalX reuses existing high-level model training utilities, geometric deep learning layers, and deep chemistry layers from the PyTorch ecosystem.
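
ChemicalX ships ready-made models, but the underlying task is easy to state in plain PyTorch: given representations of two drugs (and often a biological context such as a cell line), predict a score for the pair. The sketch below is a generic pair-scoring model, not ChemicalX's actual API; all names and dimensions are illustrative.

```python
# Generic drug pair scoring sketch in plain PyTorch -- illustrative only,
# not the ChemicalX API. Each drug is a fingerprint-style feature vector and
# the cell-line context is a feature vector; the model predicts a pair score.
import torch
import torch.nn as nn

class PairScorer(nn.Module):
    def __init__(self, drug_dim=256, context_dim=112, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * drug_dim + context_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, drug_a, drug_b, context):
        # Concatenate both drug representations with the context, score the pair.
        return torch.sigmoid(self.mlp(torch.cat([drug_a, drug_b, context], dim=-1)))

model = PairScorer()
a, b = torch.rand(32, 256), torch.rand(32, 256)   # fingerprints for 32 drug pairs
ctx = torch.rand(32, 112)                          # cell-line context features
scores = model(a, b, ctx)                          # (32, 1) pair probabilities
```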

The Why and How of Scaling Large Language Models
https://www.youtube.com/watch?v=qscouq3lo0s | Tue, 04 Jan 2022
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems. Over the past decade, the amount of compute used for the largest training runs has increased at an exponential pace. We've also seen in many domains that larger models attain better performance, following precise scaling laws. The compute needed to train these models can only be attained using many coordinated machines that communicate data between them.
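
The "precise scaling laws" referred to here are empirical power laws: loss falls predictably as a power of training compute. A toy illustration follows; the coefficients are invented for demonstration, not Anthropic's measured values, which are fit to real training runs.

```python
# Toy power-law scaling curve: loss(C) = a * C**(-b) + L_inf.
# Coefficients below are made up for illustration; in practice they are
# fit to measured losses from training runs at increasing compute budgets.
a, b, L_inf = 10.0, 0.05, 1.7

for compute in [1e18, 1e20, 1e22, 1e24]:          # training FLOPs
    loss = a * compute ** (-b) + L_inf
    print(f"{compute:.0e} FLOPs -> predicted loss {loss:.3f}")
```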

SearchSage: Learning Search Query Representations at Pinterest
https://medium.com/pinterest-engineering/searchsage-learning-search-query-representations-at-pinterest-654f2bb887fc | Tue, 09 Nov 2021
Pinterest surfaces billions of ideas to people every day, and the neural modeling of embeddings for content, users, and search queries is key to the constant improvement of these machine learning-powered recommendations. Good embeddings — representations of discrete entities as vectors of numbers — enable fast candidate generation and are strong signals to models that classify, retrieve, and rank relevant content.
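
To make "fast candidate generation" concrete, here is a minimal two-tower-style sketch: a query encoder maps queries into the same vector space as precomputed content embeddings, and candidates are the nearest neighbors. It is illustrative only, not Pinterest's SearchSage implementation; all dimensions and token IDs are made up.

```python
# Illustrative two-tower retrieval sketch -- not Pinterest's actual code.
# A query encoder and precomputed content embeddings share one vector space;
# candidate generation is nearest-neighbor search over normalized vectors.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
query_encoder = torch.nn.EmbeddingBag(50_000, 128)   # hashed query n-grams -> vector
content_embeddings = F.normalize(torch.randn(100_000, 128), dim=-1)  # corpus vectors

tokens = torch.tensor([[17, 4031, 22_900]])          # hashed n-grams of one query
q = F.normalize(query_encoder(tokens), dim=-1)

# Cosine similarity against every item; production systems use an ANN index.
scores = content_embeddings @ q.squeeze(0)
top_items = scores.topk(10).indices                  # candidate generation
```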

Using a Grapheme to Phoneme Model in Cisco's Webex Assistant
https://blogs.cisco.com/developer/graphemephoneme01 | Tue, 07 Sep 2021
Grapheme to Phoneme (G2P) is a function that generates pronunciations (phonemes) for words based on their written form (graphemes). It plays an important role in automatic speech recognition systems, natural language processing, and text-to-speech engines. In Cisco's Webex Assistant, we use G2P modelling to help resolve person names from voice. See the full post for details of the various techniques we use to build robust voice assistants.
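
G2P is commonly framed as sequence-to-sequence transduction: encode the grapheme sequence, then decode a phoneme sequence. The skeleton below shows that shape in PyTorch; it is a generic illustration with made-up vocabulary sizes, not Cisco's production model.

```python
# Minimal character-level G2P skeleton (illustrative, not Cisco's model):
# an encoder reads graphemes, a decoder emits phonemes step by step.
import torch
import torch.nn as nn

class G2P(nn.Module):
    def __init__(self, n_graphemes=30, n_phonemes=45, dim=128):
        super().__init__()
        self.g_emb = nn.Embedding(n_graphemes, dim)
        self.p_emb = nn.Embedding(n_phonemes, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_phonemes)

    def forward(self, graphemes, phonemes):
        # Encode the spelling; condition the phoneme decoder on the final state.
        _, h = self.encoder(self.g_emb(graphemes))
        dec, _ = self.decoder(self.p_emb(phonemes), h)
        return self.out(dec)                     # logits over the phoneme inventory

model = G2P()
spelling = torch.randint(0, 30, (8, 12))         # batch of 8 names, 12 graphemes each
phonemes = torch.randint(0, 45, (8, 10))         # teacher-forced phoneme prefixes
logits = model(spelling, phonemes)               # (8, 10, 45)
```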

University of Pécs enables text and speech processing in Hungarian, builds the BERT-large model with just 1,000 euro with Azure
https://www.microsoft.com/en/customers/story/1402696956382669362-university-of-pecs-higher-education-azure-en-hungary | Tue, 10 Aug 2021
Everyone prefers to use their mother tongue when communicating with chat agents and other automated services. However, for languages like Hungarian—spoken by only 15 million people—the market is often viewed as too small for large companies to create software, tools, or applications that can process Hungarian text as input. Recognizing this need, the Applied Data Science and Artificial Intelligence team at the University of Pécs decided to step up.

How 3DFY.ai Built a Multi-Cloud, Distributed Training Platform Over Spot Instances with TorchElastic and Kubernetes
https://medium.com/pytorch/how-3dfy-ai-built-a-multi-cloud-distributed-training-platform-over-spot-instances-with-44be40936361 | Thu, 17 Jun 2021
Deep learning development is increasingly about minimizing the time from idea to trained model. To shorten this lead time, researchers need access to a training environment that supports running multiple experiments concurrently, each utilizing several GPUs.
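
TorchElastic (now torch.distributed.elastic, launched via the torchrun CLI) is what makes spot instances workable: workers can join or leave a rendezvous, and the job restarts from the last checkpoint instead of dying. Below is a minimal skeleton of an elastic job; the endpoint, job ID, and script names are placeholders, and the model is a stand-in, not 3DFY.ai's setup.

```python
# Minimal elastic-training skeleton (placeholder names throughout).
# TorchElastic jobs are launched with torchrun, which re-forms the worker
# group when spot nodes come and go, e.g.:
#   torchrun --nnodes=1:4 --nproc_per_node=8 --rdzv_backend=c10d \
#            --rdzv_endpoint=rendezvous-host:29400 --rdzv_id=job42 train.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")      # rank/world size set by torchrun
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = torch.nn.Linear(128, 10).cuda()      # stand-in for the real model
    ddp = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
    # ... training loop; checkpoint regularly so restarted workers can resume ...
    if dist.get_rank() == 0:
        torch.save(ddp.module.state_dict(), "checkpoint.pt")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```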

AI21 Labs Trains 178-Billion-Parameter Language Model Using Amazon EC2 P4d Instances, PyTorch
https://aws.amazon.com/solutions/case-studies/AI21-case-study-p4d/ | Mon, 07 Jun 2021
AI21 Labs uses machine learning to develop language models focused on understanding meaning, and in 2021 it set a goal to train the recently released Jurassic-1 Jumbo, an autoregressive language model with 178 billion parameters. Developers who register for beta testing get access to Jurassic-1 Jumbo and can immediately start customizing the model for their use case. The software startup wanted to train the model efficiently, so it looked to Amazon Web Services (AWS).

Deepset achieves a 3.9x speedup and 12.8x cost reduction for training NLP models by working with AWS and NVIDIA
https://aws.amazon.com/blogs/machine-learning/deepset-achieves-a-3-9x-speedup-and-12-8x-cost-reduction-for-training-nlp-models-by-working-with-aws-and-nvidia/ | Wed, 27 Jan 2021
At deepset, we're building the next-level search engine for business documents. Our core product, Haystack, is an open-source framework that enables developers to use the latest NLP models for semantic search and question answering at scale. Our software-as-a-service (SaaS) platform, Haystack Hub, is used by developers from various industries, including finance, legal, and automotive, to find answers in all kinds of text documents.
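
The semantic-search pattern behind frameworks like Haystack can be sketched in a few lines: embed documents and the query with one encoder, then rank by cosine similarity. The example below uses the sentence-transformers library with the public all-MiniLM-L6-v2 checkpoint as a stand-in encoder; it is a generic illustration, not Haystack's actual API, and the documents are toy examples.

```python
# Generic dense-retrieval sketch (not Haystack's API): embed documents and
# the query with the same encoder, then rank documents by cosine similarity.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "The contract may be terminated with 30 days written notice.",
    "Quarterly revenue grew 12 percent year over year.",
    "The vehicle warranty covers the battery for eight years.",
]
doc_emb = encoder.encode(docs, convert_to_tensor=True)

query = "How long is the battery covered?"
q_emb = encoder.encode(query, convert_to_tensor=True)

scores = util.cos_sim(q_emb, doc_emb)[0]         # similarity to each document
best = int(scores.argmax())
print(docs[best], float(scores[best]))
```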
