Generative AI
Posted by Pierre-Edouard Guerin · 2 min read · Published on May 27, 2024
I was fortunate to follow the course of Sven Warris about software tools to integrate genAI into your own work and applications. The course is aimed at data scientists and bioinformaticians.
Introduction
- Artificial Intelligence (AI): Technology aimed at mimicking human abilities like reasoning, learning, problem-solving, and perception.
- Machine Learning (ML): Uses algorithms and statistical models to execute tasks based on pattern recognition and inference, without explicit instructions.
- Statistics: The foundational science for analyzing and interpreting data, essential for AI development.
- Deep Learning (DL): A subset of ML that utilizes layered neural networks, excelling in complex tasks such as speech and image recognition.
- Generative AI: Focuses on creating new content (e.g., images, texts, sounds) from trained data, emphasizing the versatility in output types.
Process of Tokenizing and Embedding
- Tokenization: Breaking down text into smaller units called tokens e.g. words.
- Embedding: Converts tokens into numerical vectors that represent both the literal and contextual meanings of the tokens. This process is critical for machine understanding and manipulation of language.
Applications and Developments in Generative AI
- Prompt Engineering: Often referred to as prompt hacking, it involves crafting queries that guide AI to produce desired outcomes. Used in various applications from grant proposals to blog posts and image generation.
- Data Analytics and Application Integration: LLMs can be integrated with APIs like OpenWeather for real-time data processing and activity suggestions based on weather conditions.
- Automated Document Processing: Tools like Langchain and ChatGPT can process and generate structured outputs from documents such as scientific papers.
Prompt engineering using the ChatGPT web interface
The OpenAI API
text
Data analytics
Retrieval Augmented Generation (RAG)
Image generation and processing through the API
Langchain and large documents
References
- Generative AI workshop: Sven Warris doc
Published on May 27, 2024
Relevant Tags
About the Author
Latest Articles
-
Chado: the GMOD Database Schema
the Generic Model Organism Database project or GMOD is a collection of open source software tools for managing, visualising, storing, and disseminating genetic and genomic data.JAN 2025 · PIERRE-EDOUARD GUERIN -
Error Messages
I am an anxious person. So error messages always makes my heart beat faster. Hopefully, following the Pareto Principle, 80% of error messages are mild while 20% are the really tough one. The point is to solve the first kind as quickly as possible and effortless. To do so, allow the user to solve the issue by himself with clear messages and hints (in the case of errors related to input files or parameters). Clear presentation of the context and precise localization of the error in the code will save a lot of useless and tedious work to the developer. The time spared on the easy errors just by having better messages, then can be reallocated to the second kind of errors, the troublemakers.NOV 2024 · PIERRE-EDOUARD GUERIN -
Generative AI
I was fortunate to follow the course of Sven Warris about software tools to integrate genAI into your own work and applications. The course is aimed at data scientists and bioinformaticians.MAY 2024 · PIERRE-EDOUARD GUERIN