Presented at All Things Open RTP Meetup
Presented by Karthik Uppuluri, Fidelity
Title: Generative AI
Abstract: In this session, let us embark on a journey into the fascinating world of generative artificial intelligence. As an emergent and captivating branch of machine learning, generative AI has become instrumental in a myriad of sectors, ranging from the visual arts to software development. This session requires no prior expertise in machine learning or AI. It aims to build a robust understanding of the fundamental concepts and principles of generative AI and its diverse applications. Join us as we delve into the mechanics of this transformative technology and unpack its potential.
Exploring Opportunities in the Generative AI Value Chain.pdf - Dung Hoang
The article "Exploring Opportunities in the Generative AI Value Chain" by McKinsey & Company's QuantumBlack provides insights into the value created by generative artificial intelligence (AI) and its potential applications.
Seminar on ChatGPT Large Language Model by Abhilash Majumder (Intel)
This presentation is solely for reading purposes and contains technical details about ChatGPT fundamentals
generative-ai-fundamentals and Large language models - AdventureWorld5
Thank you for the detailed review of the protein bars. I'm glad to hear you and your family are enjoying them as a healthy snack and meal replacement option. A couple suggestions based on your feedback:
- For future orders, you may want to check the expiration dates to help avoid any dried out bars towards the end of the box. Freshness is key to maintaining the moist texture.
- When introducing someone new to the bars, selecting one in-person if possible allows checking the flexibility as an indicator it's moist inside. This could help avoid a disappointing first impression from a dry sample.
- Storing opened boxes in an airtight container in the fridge may help extend the freshness even further when you can't
Generative AI: Past, Present, and Future – A Practitioner's Perspective - Huahai Yang
As the academic realm grapples with the profound implications of generative AI
and related applications like ChatGPT, I will present a grounded view from my
experience as a practitioner. Starting with the origins of neural networks in
the fields of logic, psychology, and computer science, I trace its history and
align it within the wider context of the pursuit of artificial intelligence.
This perspective will also draw parallels with historical developments in
psychology. Against this backdrop, I chart a proposed trajectory for the future.
Finally, I provide actionable insights for both academics and enterprising
individuals in the field.
Unlocking the Power of Generative AI An Executive's Guide.pdf - PremNaraindas1
Generative AI is here, and it can revolutionize your business. With its powerful capabilities, this technology can help companies create more efficient processes, unlock new insights from data, and drive innovation. But how do you make the most of these opportunities?
This guide will provide you with the information and resources needed to understand the ins and outs of Generative AI, so you can make informed decisions and capitalize on the potential. It covers important topics such as strategies for leveraging large language models, optimizing MLOps processes, and best practices for building with Generative AI.
The document discusses how generative AI can be used to scale content operations by reducing the time it takes to generate content. It explains that generative AI learns from natural language models and can generate new text or ideas based on prompts provided by users. While generative AI has benefits like speeding up content creation and ideation, it also has limitations such as not being able to conduct original research or ensure quality. The document provides examples of how generative AI can be used for tasks like generating ideas, simplifying complex text, creating visuals, and more. It also discusses challenges like bias in AI models and the low risk of plagiarism.
ChatGPT is a natural language processing model created by OpenAI that can generate human-like responses to text-based conversations. It uses deep learning and was pre-trained on vast amounts of text to understand language. Performance is evaluated using metrics like perplexity, accuracy, fluency and human evaluation. There are ethical concerns around copyright, personal data, bias and how the training data was obtained. OpenAI has introduced a paid ChatGPT Plus subscription with additional features while maintaining the free version.
This document discusses generative AI and its potential transformations and use cases. It outlines how generative AI could enable more low-cost experimentation, blur division boundaries, and allow "talking to data" for innovation and operational excellence. The document also references responsible AI frameworks and a pattern catalogue for developing foundation model-based systems. Potential use cases discussed include automated reporting, digital twins, data integration, operation planning, communication, and innovation applications like surrogate models and cross-discipline synthesis.
As an AI language model, ChatGPT is a program consisting of a large neural network that has been trained on vast amounts of textual data. Specifically, ChatGPT is a variant of the GPT (Generative Pre-trained Transformer) family of models developed by OpenAI.
AI and ML Series - Introduction to Generative AI and LLMs - Session 1 - DianaGray10
Session 1
👉This first session will cover an introduction to Generative AI & harnessing the power of large language models. The following topics will be discussed:
Introduction to Generative AI & harnessing the power of large language models.
What’s generative AI & what’s LLM.
How are we using it in our document understanding & communication mining models?
How to develop a trustworthy and unbiased AI model using LLM & GenAI.
Personal Intelligent Assistant
Speakers:
📌George Roth - AI Evangelist at UiPath
📌Sharon Palawandram - Senior Machine Learning Consultant @ Ashling Partners & UiPath MVP
📌Russel Alfeche - Technology Leader RPA @qBotica & UiPath MVP
The document discusses generative models and their applications in artificial intelligence. Generative adversarial networks (GANs) use two neural networks, a generator and discriminator, that compete against each other. The generator learns to generate new data that looks real by fooling the discriminator, while the discriminator learns to better identify real from fake data. GANs have been used for tasks like image generation and neural style transfer. They show potential to generate art, music and other creative forms through machine learning.
Conversational AI and Chatbot Integrations - Cristina Vidu
Conversational AI and Chatbots (or rather, and more extensively, Virtual Agents) offer great benefits, especially in combination with technologies like RPA or IDP. Corneliu Niculite (Presales Director - EMEA @DRUID AI) and Roman Tobler (CEO @Routinuum & UiPath MVP) discuss Conversational AI and why Virtual Agents play a significant role in modern ways of working. Corneliu will also show how to build a Workflow and showcase an Accounts Payable Use Case, integrating DRUID and UiPath Robots.
📙 Agenda:
The focus of our meetup is around the following areas - with a lot of room to discuss and share experiences:
- What is "Conversational AI" and why do we need Chatbots (Virtual Agents);
- Deep-Dive to a DRUID-UiPath Integration via an Accounts Payable Use Case;
- Discussion, Q&A
Speakers:
👨🏻💻 Corneliu Niculite, Presales Director - EMEA DRUID AI
👨🏼💻 Roman Tobler, UiPath MVP, Co-Founder & CEO Routinuum GmbH
This session was streamed live on March 8, 2023, at 16:00 CET.
Check out our upcoming events at: community.uipath.com
Contact us at: community@uipath.com
Large Language Models, No-Code, and Responsible AI - Trends in Applied NLP in... - David Talby
An April 2023 presentation to the AMIA working group on natural language processing. The talk focuses on three current trends in NLP and how they apply in healthcare: Large language models, No-code, and Responsible AI.
In this session, you'll get all the answers about how ChatGPT and other GPT-X models can be applied to your current or future project. First, we'll put all the terms in order (OpenAI, GPT-3, ChatGPT, Codex, Dall-E, etc.) and explain why Microsoft and Azure are often mentioned in this context. Then, we'll go through the main capabilities of Azure OpenAI and the respective use cases that might inspire you to either optimize your product or build a completely new one.
An Introduction to Generative AI - May 18, 2023 - CoriFaklaris1
For this plenary talk at the Charlotte AI Institute for Smarter Learning, Dr. Cori Faklaris introduces her fellow college educators to the exciting world of generative AI tools. She gives a high-level overview of the generative AI landscape and how these tools use machine learning algorithms to generate creative content such as music, art, and text. She then shares some examples of generative AI tools and demonstrates how she has used some of these tools to enhance teaching and learning in the classroom and to boost her productivity in other areas of academic life.
Let's talk about GPT: A crash course in Generative AI for researchers - Steven Van Vaerenbergh
This talk delves into the extraordinary capabilities of the emerging technology of generative AI, outlining its recent history and emphasizing its growing influence on scientific endeavors. Through a series of practical examples tailored for researchers, we will explore the transformative influence of these powerful tools on scientific tasks such as writing, coding, data wrangling and literature review.
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck - SlideTeam
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck is loaded with easy-to-follow content, and intuitive design. Introduce the types and levels of artificial intelligence using the highly-effective visuals featured in this PPT slide deck. Showcase the AI-subfield of machine learning, as well as deep learning through our comprehensive PowerPoint theme. Represent the differences, and interrelationship between AI, ML, and DL. Elaborate on the scope and use case of machine intelligence in healthcare, HR, banking, supply chain, or any other industry. Take advantage of the infographic-style layout to describe why AI is flourishing in today’s day and age. Elucidate AI trends such as robotic process automation, advanced cybersecurity, AI-powered chatbots, and more. Cover all the essentials of machine learning and deep learning with the help of this PPT slideshow. Outline the application, algorithms, use cases, significance, and selection criteria for machine learning. Highlight the deep learning process, types, limitations, and significance. Describe reinforcement training, neural network classifications, and a lot more. Hit download and begin personalization. Our AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck are topically designed to provide an attractive backdrop to any subject. Use them to look like a presentation pro. https://bit.ly/3ngJCKf
This document discusses generative AI and several related concepts:
- Generative pre-trained transformers (GPT) can generate text through self-supervised pre-training and fine-tuning. GPT-3 demonstrated strong performance on many NLP tasks without explicit programming.
- Generative adversarial networks (GANs) use two neural networks, a generator and discriminator, that compete against each other to generate synthetic images resembling real images.
- Diffusion models add noise to training images and then reverse the process to generate new images. Stable Diffusion can generate images from text prompts.
- At Fidelity, researchers are applying generative models like NER4Opt for extracting optimization information from text and understanding Bloom's performance
This document summarizes an internship report submitted by Shikhar Srivastava to Eckovation about a machine learning internship. The internship focused on machine learning applications, algorithms, and implementations. Srivastava's project involved teaching a neural network to recognize handwritten text using the MNIST dataset. He used a random forest classifier algorithm for the project, which creates decision trees from random subsets of training data and aggregates the votes to determine classifications.
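The random forest idea mentioned in this summary (trees trained on random subsets of the training data, with their votes aggregated) can be sketched as follows. This is a minimal illustration and not the code from the report: it assumes a one-dimensional feature, uses one-split "stumps" in place of full decision trees, and omits the random feature selection that random forests usually add. The names `fit_stump`, `fit_forest`, `predict`, and the toy `data` are all hypothetical.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a one-split tree: choose the threshold and leaf labels with the fewest errors."""
    best = None
    for threshold, _ in samples:
        for left, right in ((0, 1), (1, 0)):
            errors = sum(
                (left if x <= threshold else right) != label for x, label in samples
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right


def fit_forest(samples, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement) of the data."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        trees.append(fit_stump(bootstrap))
    return trees


def predict(trees, x):
    """Aggregate the individual votes and return the majority class."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Toy data (assumed for illustration): (feature, label) pairs, class 0 low, class 1 high.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
forest = fit_forest(data)
```

Calling `predict(forest, x)` returns the class that most of the stumps vote for. A single stump trained on an unlucky bootstrap sample can be wrong, which is the reason for taking a majority vote over many of them.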
Machine learning is the subfield of computer science that, according to Arthur Samuel in 1959, gives "computers the ability to learn without being explicitly programmed." Evolved from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms overcome following strictly static program instructions by making data-driven predictions or decisions, through building a model from sample inputs. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible; example applications include email filtering, detection of network intruders or malicious insiders working towards a data breach, optical character recognition (OCR), learning to rank, and computer vision.
The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Many successful applications of machine learning exist already, including systems that analyze past sales data to predict customer behavior, optimize robot behavior so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data
This document provides an introduction to artificial intelligence (AI) and machine learning. It outlines two main approaches in AI programming: rules-based and machine learning. Rules-based AI uses rules defined by human experts, while machine learning learns from data without being explicitly programmed. Several examples are given of when each approach would be used, such as rules-based for alphabetizing songs but machine learning for predicting housing prices. The machine learning process is described along with examples of techniques like classification, clustering, and regression. Deep learning and frameworks like TensorFlow are introduced for building large neural networks. Real-world applications across various domains are highlighted.
This document provides an introduction to artificial intelligence (AI) and machine learning. It outlines two main approaches in AI programming: rules-based and machine learning. Rules-based AI uses rules defined by human experts, while machine learning learns from data without being explicitly programmed. Several examples are given of when each approach would be used, such as rules-based for alphabetizing songs but machine learning for predicting housing prices. The machine learning process and some common techniques like classification, clustering, and regression are then explained. Deep learning and frameworks like TensorFlow are introduced for building large neural networks. Real-world applications across various domains are discussed.
AI & ML in Defence Systems - Sunil Chomal
Talk on Artificial Intelligence & Machine Learning in Defense Systems at ‘Tutorial cum workshop on AI&ML’ organized by IEEE Bombay Section in collaboration with the India Council during August 10-11, 2018.
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer... - Madhav Mishra
The document discusses machine learning paradigms including supervised learning, unsupervised learning, clustering, artificial neural networks, and more. It then discusses how supervised machine learning works using labeled training data for tasks like classification and regression. Unsupervised learning is described as using unlabeled data to find patterns and group data. Semi-supervised learning uses some labeled and some unlabeled data. Reinforcement learning provides rewards or punishments to achieve goals. Inductive learning infers functions from examples to make predictions for new examples.
When Deep Learning Meets Recommender System - Asi Messica
Deep learning techniques have been increasingly applied to recommender systems. Some key applications discussed in the document include using word embeddings to learn vector representations of items, sequential models like GRU4REC to predict user sessions, and hybrid models that combine deep learning with collaborative filtering approaches. Exploration-exploitation techniques are also important to optimize for maximizing rewards in recommender systems, and Bayesian methods like dropout can help estimate uncertainty to inform exploration strategies.
Traditional machine learning relied on hand-engineered features and modality-specific models to classify images or text or to recognize voices. Deep learning with neural networks identifies features and finds patterns automatically. Advances in deep learning have drastically reduced the time needed to build these complex systems and have substantially improved accuracy. Neural networks were partly inspired by how the roughly 86 billion neurons in the human brain work, but they have become more of a mathematical and computational problem. By the end of the blog we will see how neural networks can be intuitively understood and implemented as a set of matrix multiplications, a cost function, and optimization algorithms.
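The three ingredients named here (matrix multiplications, a cost function, and an optimization algorithm) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the blog's own code: it trains a single linear layer, the simplest case with no hidden layers or activation functions, using mean squared error and plain gradient descent. The helper names and the toy data are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def mse(pred, target):
    """Cost function: mean squared error between column vectors."""
    return sum((p[0] - t[0]) ** 2 for p, t in zip(pred, target)) / len(pred)


def train(x, y, lr=0.1, steps=500):
    """Optimization: gradient descent on the weights of one linear layer."""
    n = len(x)
    w = [[0.0] for _ in x[0]]
    for _ in range(steps):
        pred = matmul(x, w)                       # forward pass
        err = [[p[0] - t[0]] for p, t in zip(pred, y)]
        grad = matmul(transpose(x), err)          # X^T (pred - y)
        w = [[wi[0] - lr * 2.0 / n * g[0]] for wi, g in zip(w, grad)]
    return w


# Toy data (assumed): y = 2 * x1 + 1, with a constant column standing in for the bias.
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
Y = [[1.0], [3.0], [5.0], [7.0]]
W = train(X, Y)
```

After training, `W` should be close to the weight 2 and the bias 1 that generated the data. A deeper network repeats the same pattern with more matrices and a nonlinearity between them.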
Hot Topics in Machine Learning for Research and Thesis - WriteMyThesis
Machine Learning is a hot topic for research. There are various good thesis topics in Machine Learning. Writemythesis provides theses in Machine Learning along with proper guidance in this field. Find the list of thesis topics in this document.
http://www.writemythesis.org/master-thesis-topics-in-machine-learning/
In the last few years, deep learning has achieved significant success in a wide range of domains, including computer vision, artificial intelligence, speech, NLP, and reinforcement learning. However, deep learning in recommender systems has, until recently, received relatively little attention. This talk explores recent advances in this area in both research and practice. I will explain how deep learning can be applied to recommendation settings, architectures for handling contextual data, side information, and time-based models.
This document summarizes a research paper that proposes using machine learning models to detect phishing websites. It begins with an abstract that introduces the growing problem of phishing attacks and how machine learning can help address it. The document then provides more detail on the proposed methodology, which includes collecting a dataset of legitimate and phishing URLs, extracting features from the URLs, and training/comparing several machine learning models (decision trees, random forests, XGBoost, etc.). It finds that the XGBoost model performs best at classifying URLs as legitimate or phishing. The summary concludes by noting this approach could help create effective phishing detection tools, while acknowledging challenges like accounting for evolving attack types and adversarial examples.
Data locality and distribution
● massively decentralized, naturally arising (non-IID) partition
● Data is siloed, held by a small number of coordinating entities
● system-controlled (e.g. shuffled, balanced)
Data availability
● limited availability, time-of-day variations
● almost all data nodes
The document discusses using generative adversarial networks (GANs) for text-to-image generation. GANs involve two neural networks, a generator and a discriminator, that compete against each other. The generator produces images from text descriptions, while the discriminator tries to distinguish real images from generated ones. The document outlines the network architecture, a literature review of GAN improvements, and the methodology used, which involves training the GAN on a dataset to generate high-resolution images from low-resolution inputs conditioned on text.
Building Reliability - The Realities of Observability - All Things Open
Presented at the ATO RTP Meetup
Presented by Jeremy Proffit, Director of DevSecOps & SRE for Customer Care and Communications, Ally
Title: Building Reliability - The Realities of Observability
Abstract: Join me as we discuss true observability and learn what works and what doesn't. We'll discuss not only dashboards, monitoring and alerting, but also how these can be built by automation or included in your IaC modules. We'll talk about how to properly alert staff based on priority to keep your staff and yourself sane. We'll even discuss architecture, how it impacts reliability, and why serverless isn't always the best at being reliable.
Presented at the ATO RTP Meetup
Presented by Peter Zaitsev, Founder of Percona
Title: Modern Database Best Practices
Abstract: There are now more database choices available for developers than ever before: general purpose and specialized databases, single node and distributed databases, Open Source and proprietary databases, and databases available exclusively in the cloud. In this presentation we will cover best practices for choosing database(s) for your applications, for application development, and for managing those databases to achieve the best possible performance, security, and availability at the lowest cost.
All Things Open 2023
Presented at All Things Open 2023
Presented by Deb Bryant - Open Source Initiative, Patrick Masson - Apereo Foundation, Stephen Jacobs - Rochester Institute of Technology, Ruth Suehle - SAS, & Greg Wallace - FreeBSD Foundation
Title: Open Source and Public Policy
Abstract: New regulations in the software industry and adjacent areas such as AI, open science, open data, and open education are on the rise around the world. Cyber Security, societal impact of AI, data and privacy are paramount issues for legislators globally. At the same time, the COVID-19 pandemic drove collaborative development to unprecedented levels and took Open Source software, open research, open content and data from mainstream to main stage, creating tension between public benefit and citizen safety and security as legislators struggle to find a balance between open collaboration and protecting citizens.
Historically, the open source software community and foundations supporting its work have not engaged in policy discussions. Moving forward, thoughtful development of these important public policies whilst not harming our complex ecosystems requires an understanding of how our ecosystem operates. Ensuring stakeholders without historic benefit of representation in those discussions becomes paramount to that end.
Please join our open discussion with open policy stakeholders working constructively on current open policy topics. Our panelists will provide a view into how OSS foundations and other open domain allies are now rising to this new challenge as well as seizing the opportunity to influence positive changes to the public’s benefit.
Topics: Public Policy, Open Science, Open Education, current legislation in the US and EU, US interest in OSS sustainability, intro to the Open Policy Alliance
Find more info about All Things Open:
On the web: https://www.allthingsopen.org/
Twitter: https://twitter.com/AllThingsOpen
LinkedIn: https://www.linkedin.com/company/all-things-open/
Instagram: https://www.instagram.com/allthingsopen/
Facebook: https://www.facebook.com/AllThingsOpen
Mastodon: https://mastodon.social/@allthingsopen
Threads: https://www.threads.net/@allthingsopen
2023 conference: https://2023.allthingsopen.org/
Weaving Microservices into a Unified GraphQL Schema with graph-quilt - Ashpak... - All Things Open
This document summarizes a presentation about graph-quilt, an open source GraphQL orchestrator library. It discusses the challenges of building a GraphQL orchestrator to unify data from multiple services. Graph-quilt addresses this by allowing services to register their GraphQL schemas and composing them into a unified schema. It also supports features like remote schema extensions, authorization, and adapting existing REST APIs. The presenters believe graph-quilt provides a flexible way to build GraphQL gateways and help more clients adopt GraphQL.
The State of Passwordless Auth on the Web - Phil Nash - All Things Open
Presented at All Things Open 2023
Presented by Phil Nash - Sonar
Title: The State of Passwordless Auth on the Web
Abstract: Can we get rid of passwords yet? They make for a poor user experience and users are notoriously bad with them. The advent of WebAuthn has brought a passwordless world closer, but where do we really stand?
In this talk we'll explore the current user experience of WebAuthn and the requirements a user has to fulfil to authenticate without a password. We'll also explore the fallbacks and safeguards we can use to make the password experience better and more secure. By the end of the session you'll have a vision of how authentication could look in the future and a blueprint for how to build the best auth experience today.
Total ReDoS: The dangers of regex in JavaScript - All Things Open
Presented at All Things Open 2023
Presented by Phil Nash - Sonar
Title: Total ReDoS: The dangers of regex in JavaScript
Abstract: Regular expressions are complicated and can be hard to learn. On top of that, they can also be a security risk; writing the wrong pattern can open your application up to denial of service attacks. One token out of place and you invite in the dreaded ReDoS.
But how can a regular expression cause this? In this talk we’ll track down the patterns that can cause this trouble, explain why they are an issue and propose ways to fix them now and avoid them in the future. Together we’ll demystify these powerful search patterns and keep your application safe from expressions that behave in a way that is anything but regular.
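One commonly cited trouble pattern is a nested quantifier such as `(a+)+`. The sketch below is an illustration and not material from the talk: it uses Python's `re` module rather than JavaScript, on the assumption that both are backtracking engines, and the helper `count_partitions` is hypothetical. It counts how many ways a run of n `a` characters can be split into groups, which is the space a backtracking matcher may have to explore before giving up on an input like `aaaa...b`. It also shows a rewrite with a single quantifier that accepts the same strings.

```python
import re
from functools import lru_cache

VULNERABLE = re.compile(r"^(a+)+$")  # nested quantifier: many ways to match the same run
SAFE = re.compile(r"^a+$")           # same accepted strings, only one way to match


@lru_cache(maxsize=None)
def count_partitions(n):
    """Ways to split n consecutive 'a' characters into non-empty groups."""
    if n == 0:
        return 1
    # Choose the size k of the first group, then split the remaining n - k characters.
    return sum(count_partitions(n - k) for k in range(1, n + 1))


# Both patterns agree on short inputs, so the rewrite does not change what is accepted.
samples = ["a", "aaaa", "aaab", "", "b"]
agree = all(bool(VULNERABLE.match(s)) == bool(SAFE.match(s)) for s in samples)

# The count doubles with every extra character: 2 ** (n - 1) for n >= 1.
growth = [count_partitions(n) for n in (5, 10, 20, 30)]
```

The samples are kept deliberately short. Running the vulnerable pattern against a long run of `a` followed by a non-matching character is exactly the case that can stall a backtracking engine, and the number of steps actually taken depends on the engine.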
What Does Real World Mass Adoption of Decentralized Tech Look Like? - All Things Open
Presented at All Things Open 2023
Presented by Karl Mozurkewich - Storj
Title: What Does Real World Mass Adoption of Decentralized Tech Look Like?
Abstract: We delve into the transformative potential of decentralized technology. Beginning with a brief overview of the rise of centralization with the advent of the internet and the counter-shift marked by blockchain we explore the intrinsic characteristics of decentralized and distributed systems, such as trustless operations, peer-to-peer networks, and enterprise application scalability. Various sectors, including finance, supply chains, media and entertainment, data science and cloud infrastructure are on the brink of disruption. The societal implications are vast, with the potential for greater individual empowerment, a greener planet and more viable resource utilization, but concerns about data security persist.
Presented at All Things Open 2023
Presented by Anastasia Lalamentik - Kaleido
Title: How to Write & Deploy a Smart Contract
Abstract: In this talk, Anastasia Lalamentik, Full Stack Engineer at Kaleido, will walk through how Ethereum smart contracts work and go over related concepts like gas fees, the Ethereum Virtual Machine (EVM), the block explorer, and the Solidity programming language. This is vital to anyone who wants to build a blockchain app and is a great introduction to blockchain technology for newcomers to the space.
By the end of the talk, attendees will better understand how to:
- Write a simple smart contract
- Deploy their smart contract to an Ethereum test network through the latest tools like Hardhat and the MetaMask wallet
- Test interactions with their deployed smart contract and ensure that everything is working properly
Additionally, participants will get to interact with Anastasia's deployed smart contract at the end of the talk. Anastasia’s past talks have attracted a diverse group of participants with a range of experience in the space.
Spinning Your Drones with Cadence Workflows, Apache Kafka and TensorFlow - All Things Open
Presented at All Things Open 2023
Presented by Paul Brebner - Instaclustr (by Spot by NetApp)
Title: Spinning Your Drones with Cadence Workflows, Apache Kafka and TensorFlow
Abstract: In this talk we’ll build a Drone delivery application, and then use it to do some Machine Learning “on the fly”.
In the 1st part of the talk, we'll build a real-time Drone Delivery demonstration application using a combination of two open-source technologies: Uber’s Cadence (for stateful, scheduled, long-running workflows), and Apache Kafka (for fast streaming data).
With up to 2,000 (simulated) drones and deliveries in progress at once this application generates a vast flow of spatio-temporal data.
In the 2nd part of the talk, we'll use this platform to explore Machine Learning (ML) over streaming and drifting Kafka data with TensorFlow to try and predict which shops will be busy in advance.
Presented at the All Things Open 2023 Inclusion and Diversity in Open Source Event
Presented by Efraim Marquez-Arreaza - Red Hat
Title: DEI Challenges and Success
Abstract: In today's world, many companies and organizations have Diversity, Equity and Inclusion (DEI) communities. Red Hat Unidos is a DEI community focused on advocating for the Hispanic/Latine community. In this talk, we would like to share our challenges and successes over the past 4 years and our plans for the future.
Presented at All Things Open 2023
Presented by Lydia Cupery - HubSpot
Title: Scaling Web Applications with Background Jobs: Takeaways from Generating a Huge PDF
Abstract: Do you need to perform time-consuming or CPU-intensive processes in your web application but are concerned about performance? That’s where background jobs come in. By offloading resource-intensive tasks to separate worker processes, you can improve the scalability of your web application.
In this talk, I'll share my experience of using background jobs to scale our web application. I'll discuss the challenges my team faced that led us to adopt background jobs. Then, I'll share practical tips on how to design background jobs for CPU-intensive or time-consuming processes, such as generating huge PDFs and batch emailing. I'll wrap up by going over the performance and cost tradeoffs of background jobs.
I'll use TypeScript, Express, and Heroku as examples in this talk, but the concepts and best practices that I'll share are applicable to other languages and tools.
Presented at All Things Open 2023
Presented by Robert Aboukhalil - CZI
Title: Supercharging tutorials with WebAssembly
Abstract: sandbox.bio is a free platform that features interactive command-line tutorials for bioinformatics. This talk is a deep-dive into how sandbox.bio was built, with a focus on how WebAssembly enabled bringing command-line tools like awk and grep to the web. Although these tools were originally written in C/C++, they all run directly in the browser, thanks to WebAssembly! And since the computations run on each user's computer, this makes the application highly scalable and cost-effective.
Along the way, I'll discuss how WebAssembly works and how to get started using it in your own applications. The talk will also cover more advanced WebAssembly features such as threads and SIMD, and will end with a discussion of WebAssembly's benefits and pitfalls (it's a powerful technology, but it's not always the right tool!).
Presented at All Things Open 2023
Presented by K.S. Bhaskar - YottaDB LLC
Title: Using SQL to Find Needles in Haystacks
Abstract: Database journal files capture every update to a database. A database of a few hundred GB can generate GBs worth of journal files every minute at busy times. Troubleshooting and forensics, especially of rare and intermittent problems, such as which process made what update and when, are an exercise in finding needles in haystacks. A similar problem exists with syslogs. A solution is to load the journal files and syslogs into a database, and use SQL to query the database. Bhaskar will present and demonstrate this with a 100% FOSS stack.
Configuration Security as a Game of Pursuit Intercept - All Things Open
The document discusses configuration security as a game of pursuit-evasion and intercept. It was presented by Wes Widner, Principal Engineer at Automox. The document includes a JSON policy snippet with an ID, statement, actions, effects, resources, and principal allowing the GetObject action on all objects in an S3 bucket for all principals. It has page numbers at the bottom indicating it is from a larger presentation.
Presented at All Things Open 2023
Presented by Carol Huang & Mike Fix - Stripe
Title: Scaling an Open Source Sponsorship Program
Abstract: We already know this: the open-source ecosystem needs further monetary investment from the companies that benefit most from it. Likewise, companies say they want to participate in these initiatives, but find it hard to dedicate resources to open source funding when there isn’t a clear ROI.
This talk discusses how the Open Source Program Office at Stripe built a scalable, sustainable open source sponsorship model that aligns internal company incentives with those of open source maintainers and the community at large. We go over the unique “platformization” of our OSPO that allowed us to create multiple funding models, such as BYOB (Bring Your Own Budget), and share lessons learned from this experience as well as other OSPOs.
Build Developer Experience Teams for Open Source - All Things Open
Presented at All Things Open 2023
Presented by Arundeep Nagaraj - Amazon Web Services (AWS)
Title: Build Developer Experience Teams for Open Source
Abstract: Open Source has become the default strategy for many IT organizations and Enterprises. However, the constant challenge with Open Source leaders of these organizations has been -
How is my product's developer experience?
Is this the right metric to track?
How can I scale my team to support our products better?
How can I add automation to scale redundant workflows?
If my product involves working with developers, how can I scale to the complexity of the requests and reduce Engineering bandwidth?
The challenges of supporting open source products continue to grow depending on the end-user persona, whether they are consumers of or contributors to your product. Consumers use your product, SDKs and APIs and may be blocked or run into issues, whereas contributors are advanced users who understand the codebase well enough to make a meaningful contribution back to the product.
The answer to the above is to treat Open Source support as a first-class citizen of your corporate support strategy. Employing the right level of developer-focused support, as opposed to traditional infrastructure-based support, is key to scaling to the number of developers using your product. Supporting customers in the open involves more than pure support: it means building customer/developer experiences (DX) in the open (across platforms and communities) that keep your product's users or developers focused on the end-to-end value add. This helps with your active developer growth and retention of users.
Key Takeaways:
- IT leaders of Open Source will learn to employ strategies to build a DX team that engages on multiple platforms
- Work on identifying accurate metrics for product and organization
- Innovate on platforms such as Discord to build a bot and a dashboard
- Ability to leverage customer feedback and iterate over the customer success flywheel
- Distinguish between DX and Developer Advocacy (DA)
Presented at All Things Open 2023
Presented by Danny McCormick - Google
Title: Deploying Models at Scale with Apache Beam
Abstract: Apache Beam is an open source tool for building distributed scalable data pipelines. This talk will explore how Beam can be used to perform common machine learning tasks, with a heavy focus on running inference at scale. The talk will include a demo component showing how Beam can be used to deploy and update models efficiently on both CPUs and GPUs for inference workloads.
An attendee can expect to leave this talk with a high level understanding of Beam, the challenges of deploying models at scale, and the ability to use Beam to easily parallelize their inference workloads.
Sudo – Giving access while staying in control - All Things Open
Presented at All Things Open 2023
Presented by Peter Czanik - One Identity
Title: Sudo – Giving access while staying in control
Abstract: Sudo is used by millions to control and log administrator access to systems, but with only the default configuration there are plenty of blind spots. The latest features in sudo let you watch some of these previously blind spots and control access to them. Here are four major new features, which have arrived since the 1.9.0 release, allowing you to see your blind spots:
- configuring a working directory or chroot within sudo often makes full shell access redundant
- JSON-formatted logs give you more details on events and are easier to act on
- relays in sudo_logsrvd make session recording collection more secure and reliable
- you can log and control sub-commands executed by the command run through sudo
Let us take a closer look at each of these.
Previously, there were quite a few situations where you had to give users full shell access through sudo. Typical examples include when you need to run a command from a given directory, or running commands in a chroot environment. You can now configure the working directory or the chroot directory and give access only to the command the user really needs.
Logging is a central role of sudo, to see who did what on the system. Using JSON-formatted log messages gives you even more information about events. What is even more: structured logs are easier to act on. Setting up alerting for suspicious events is much easier when you have a single parser to configure for any kind of sudo logs. You can collect sudo logs not only by local syslog, but also by using sudo_logsrvd, the same application used to collect session recordings.
Speaking of session recordings: instead of using a single central server, you can now have multiple levels of sudo_logsrvd relays between the client and the final destination. This allows session collection even if the central server is unavailable, providing you with additional security. It also makes your network configuration simpler.
Finally, you can log sub-commands executed from the command started through sudo. You can see commands started from a shell. No more unnoticed shell access from text editors. Best of all: you can also intercept sub-commands.
These are just a few of the most prominent features helping you to watch and control previous blind spots on your systems. See these and other possibilities in action in some live demos during our presentation.
Fortifying the Future: Tackling Security Challenges in AI/ML Applications - All Things Open
Presented at All Things Open 2023
Presented by Christine Abernathy - F5, Inc.
Title: Fortifying the Future: Tackling Security Challenges in AI/ML Applications
Abstract: As Artificial Intelligence (AI) and Machine Learning (ML) applications continue to surge, it is crucial to be aware of and address the security risks associated with these technologies. In this talk, Christine will explore AI/ML failure modes, threats, and mitigation strategies. She will guide you through the fundamentals of ML models then introduce you to key security challenges such as adversarial attacks, data poisoning, model inversion, model stealing, and membership inference attacks, using real-world examples to demonstrate their potential impact.
Christine will also discuss privacy and ethical considerations in ML, touching upon techniques like federated learning and shedding light on the current regulatory landscape surrounding security risks. If you are developing AI/ML applications or incorporating AI/ML components into your technology stack, check out this talk. You will walk away with a deeper understanding of the current AI/ML security landscape and a toolkit to help you address these risks, enabling you to build safer, more secure, and privacy-aware applications.
Securing Cloud Resources Deployed with Control Planes on Kubernetes using Gov... - All Things Open
Presented at All Things Open 2023
Presented by Carlos Santana - AWS
Title: Securing Cloud Resources Deployed with Control Planes on Kubernetes using Governance and Policy as Code
Abstract: Are you concerned about the security of your cloud resources deployed on Kubernetes? Are you struggling to ensure compliance with regulatory requirements while managing your cloud infrastructure? If yes, then this talk is for you!
We will discuss how to secure cloud resources deployed with Crossplane on Kubernetes using Governance and Policy as Code. We will explore how to leverage Governance and Policy as Code tools like Rego, Kyverno, and OPA to ensure security and compliance.
By the end of this talk, you will have a better understanding of the challenges associated with securing cloud resources deployed with Crossplane or ACK on Kubernetes, the importance of Governance and Policy as Code in ensuring security and compliance, and why it is critical to use open source and open standards in these technologies.
Jacquard Fabric Explained: Origins, Characteristics, and Uses - ldtexsolbl
In this presentation, we’ll dive into the fascinating world of Jacquard fabric. We start by exploring what makes Jacquard fabric so special. It’s known for its beautiful, complex patterns that are woven into the fabric thanks to a clever machine called the Jacquard loom, invented by Joseph Marie Jacquard back in 1804. This loom uses either punched cards or modern digital controls to handle each thread separately, allowing for intricate designs that were once impossible to create by hand.
Next, we’ll look at the unique characteristics of Jacquard fabric and the different types you might encounter. From the luxurious brocade, often used in fancy clothing and home décor, to the elegant damask with its reversible patterns, and the artistic tapestry, each type of Jacquard fabric has its own special qualities. We’ll show you how these fabrics are used in everyday items like curtains, cushions, and even artworks, making them both functional and stylish.
Moving on, we’ll discuss how technology has changed Jacquard fabric production. Here, LD Texsol takes center stage. As a leading manufacturer and exporter of electronic Jacquard looms, LD Texsol is helping to modernize the weaving process. Their advanced technology makes it easier to create even more precise and complex patterns, and also helps make the production process more efficient and environmentally friendly.
Finally, we’ll wrap up by summarizing the key points and highlighting the exciting future of Jacquard fabric. Thanks to innovations from companies like LD Texsol, Jacquard fabric continues to evolve and impress, blending traditional techniques with cutting-edge technology. We hope this presentation gives you a clear picture of how Jacquard fabric has developed and where it’s headed in the future.
Webinar: Transforming Substation Automation with Open Source Solutions - DanBrown980551
This webinar will provide an overview of open source software and tooling for digital substation automation in energy systems. The speakers will provide a brief overview of how open source collaborative development works in general, then delve into how it is driving innovation and accelerating the pace of substation automation. Examples of specific open source solutions and real-world implementations by utilities will be discussed. Participants will walk away with a better understanding of the challenges of automating substations, the ecosystem of solutions available to help, and best practices for implementing them.
Connecting Attitudes and Social Influences with Designs for Usable Security a... - Cori Faklaris
Many system designs for cybersecurity and privacy have failed to account for individual and social circumstances, leading people to use workarounds such as password reuse or account sharing that can lead to vulnerabilities. To address the problem, researchers are building new understandings of how individuals’ attitudes and behaviors are influenced by the people around them and by their relationship needs, so that designers can take these into account. In this talk, I will first share my research to connect people’s security attitudes and social influences with their security and privacy behaviors. As part of this, I will present the Security and Privacy Acceptance Framework (SPAF), which identifies Awareness, Motivation, and Ability as necessary for strengthening people’s acceptance of security and privacy practices. I then will present results from my project to trace where social influences can help overcome obstacles to adoption such as negative attitudes or inability to troubleshoot a password manager. I will conclude by discussing my current work to apply these insights to mitigating phishing in SMS text messages (“smishing”).
Ensuring Secure and Permission-Aware RAG Deployments - Zilliz
In this talk, we will explore the critical aspects of securing Retrieval-Augmented Generation (RAG) deployments. The focus will be on implementing robust secured data retrieval mechanisms and establishing permission-aware RAG frameworks. Attendees will learn how to ensure that access control is rigorously maintained within the model when ingesting documents, ensuring that only authorized personnel can retrieve data. We will also discuss strategies to mitigate risks of data leakage, unauthorized access, and insider threats in RAG deployments. By the end of this session, participants will have a clearer understanding of the best practices and tools necessary to secure their RAG deployments effectively.
DefCamp_2016_Chemerkin_Yury-publish.pdf - Presentation by Yury Chemerkin at DefCamp 2016 discussing mobile app vulnerabilities, data protection issues, and analysis of security levels across different types of mobile applications.
Using ScyllaDB for Real-Time Write-Heavy Workloads - ScyllaDB
Keeping latencies low for highly concurrent, intensive data ingestion
ScyllaDB’s “sweet spot” is workloads over 50K operations per second that require predictably low (e.g., single-digit millisecond) latency. And its unique architecture makes it particularly valuable for the real-time write-heavy workloads such as those commonly found in IoT, logging systems, real-time analytics, and order processing.
Join ScyllaDB technical director Felipe Cardeneti Mendes and principal field engineer, Lubos Kosco to learn about:
- Common challenges that arise with real-time write-heavy workloads
- The tradeoffs teams face and tips for negotiating them
- ScyllaDB architectural elements that support real-time write-heavy workloads
- How your peers are using ScyllaDB with similar workloads
Project Delivery Methodology on a page with activities, deliverables - CLIVE MINCHIN
I've not found a 1 pager like this anywhere so I created it based on my experiences. This 1 pager details a waterfall style project methodology with defined phases, activities, deliverables, assumptions. There's nothing in here that conflicts with commonsense.
Getting Ready for Copilot for Microsoft 365 with Governance Features in Share... - Juan Carlos Gonzalez
Session delivered at the Microsoft 365 Chicago Community Days, where I introduce how governance controls within SharePoint Premium are a key asset in a successful rollout of Copilot for Microsoft 365. The session was mostly hands-on, with multiple demos, as you can see in the session recording available on YouTube: https://www.youtube.com/watch?v=MavcP6k5nU8&t=199s. For more information about the governance controls available in SharePoint Premium, visit the official documentation at Microsoft Learn: https://learn.microsoft.com/en-us/sharepoint/advanced-management
Airports, banks, stock exchanges, and countless other critical operations got thrown into chaos!
In an unprecedented event, a recent CrowdStrike update caused a global IT meltdown, leading to widespread Blue Screen of Death (BSOD) errors and crippling 8.5 million Microsoft Windows systems.
What triggered this massive disruption? How did Microsoft step in to provide a lifeline? And what are the next steps for recovery?
Swipe to uncover the full story, including expert insights and recovery steps for those affected.
Welcome to our third live UiPath Community Day Amsterdam! Come join us for a half-day of networking and UiPath Platform deep-dives, for devs and non-devs alike, in the middle of summer ☀.
📕 Agenda:
12:30 Welcome Coffee/Light Lunch ☕
13:00 Event opening speech
Ebert Knol, Managing Partner, Tacstone Technology
Jonathan Smith, UiPath MVP, RPA Lead, Ciphix
Cristina Vidu, Senior Marketing Manager, UiPath Community EMEA
Dion Mes, Principal Sales Engineer, UiPath
13:15 ASML: RPA as Tactical Automation
Tactical robotic process automation for solving short-term challenges, while establishing standard and re-usable interfaces that fit IT's long-term goals and objectives.
Yannic Suurmeijer, System Architect, ASML
13:30 PostNL: an insight into RPA at PostNL
Showcasing the solutions our automations have provided, the challenges we’ve faced, and the best practices we’ve developed to support our logistics operations.
Leonard Renne, RPA Developer, PostNL
13:45 Break (30')
14:15 Breakout Sessions: Round 1
Modern Document Understanding in the cloud platform: AI-driven UiPath Document Understanding
Mike Bos, Senior Automation Developer, Tacstone Technology
Process Orchestration: scale up and have your Robots work in harmony
Jon Smith, UiPath MVP, RPA Lead, Ciphix
UiPath Integration Service: connect applications, leverage prebuilt connectors, and set up customer connectors
Johans Brink, CTO, MvR digital workforce
15:00 Breakout Sessions: Round 2
Automation, and GenAI: practical use cases for value generation
Thomas Janssen, UiPath MVP, Senior Automation Developer, Automation Heroes
Human in the Loop/Action Center
Dion Mes, Principal Sales Engineer @UiPath
Improving development with coded workflows
Idris Janszen, Technical Consultant, Ilionx
15:45 End remarks
16:00 Community fun games, sharing knowledge, drinks, and bites 🍻
Discover practical tips and tricks for streamlining your Marketo programs from end to end. Whether you're new to Marketo or looking to enhance your existing processes, our expert speakers will provide insights and strategies you can implement right away.
Leading Bigcommerce Development Services for Online Retailers - SynapseIndia
As a leading provider of Bigcommerce development services, we specialize in creating powerful, user-friendly e-commerce solutions. Our services help online retailers increase sales and improve customer satisfaction.
2. Introduction
Artificial Intelligence, Machine Learning and Deep Learning
• In the broadest sense, AI refers to the simulation of human intelligence processes by
computer systems.
• Machine Learning is a subset of AI that focuses on designing systems which can learn
and make decisions/predictions based on data.
• Deep Learning is a subset of Machine Learning that uses a specific set of algorithms known
as neural networks, often with many layers.
For Wifi use Greenline
3. Introduction
Types of Machine Learning Models
• Supervised Learning
Supervised Learning is a type of Machine Learning model trained on
labeled data
Email Spam Classification Model
Data: Examples of emails either tagged as Spam or not Spam
Training:
Discriminative – Learns the boundary that separates “spam” vs “not spam”
Generative – Learns the distribution of “spam” and “not spam” emails to
understand how each class generates content
Inference
Discriminative – Determine on which side of the boundary a new email falls
Generative – Based on learned distributions compute the likelihood of the
new email being “spam” vs “not spam”
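The discriminative/generative contrast above can be sketched on a toy spam dataset. This is only an illustration under stated assumptions: the six training emails are made up, Naive Bayes stands in for the generative model, and a perceptron stands in for the discriminative one; the slides do not name specific algorithms.

```python
import math
from collections import Counter

# Hypothetical toy data (label 1 = spam, 0 = not spam); not from the talk.
TRAIN = [
    ("win free money now", 1),
    ("free prize claim now", 1),
    ("cheap money offer", 1),
    ("meeting agenda for monday", 0),
    ("project status and agenda", 0),
    ("lunch on monday", 0),
]
VOCAB = sorted({w for text, _ in TRAIN for w in text.split()})


def train_generative(data):
    """Naive Bayes: estimate per-class word counts and class frequencies."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for text, label in data:
        counts[label].update(text.split())
        docs[label] += 1
    return counts, docs


def predict_generative(model, text):
    """Return the class under which the email is most likely."""
    counts, docs = model
    scores = {}
    for label in (0, 1):
        total = sum(counts[label].values())
        score = math.log(docs[label] / sum(docs.values()))
        for w in text.split():
            # Laplace smoothing keeps unseen words from zeroing the likelihood.
            score += math.log((counts[label][w] + 1) / (total + len(VOCAB)))
        scores[label] = score
    return max(scores, key=scores.get)


def train_discriminative(data, epochs=20):
    """Perceptron: learn the weights of a separating boundary directly."""
    weights = {w: 0.0 for w in VOCAB}
    bias = 0.0
    for _ in range(epochs):
        for text, label in data:
            words = text.split()
            activation = bias + sum(weights[w] for w in words)
            error = label - (1 if activation > 0 else 0)
            if error:
                bias += error
                for w in words:
                    weights[w] += error
    return weights, bias


def predict_discriminative(model, text):
    """Report which side of the learned boundary the email falls on."""
    weights, bias = model
    score = bias + sum(weights.get(w, 0.0) for w in text.split())
    return 1 if score > 0 else 0


generative_model = train_generative(TRAIN)
discriminative_model = train_discriminative(TRAIN)
```

Both models agree on easy cases such as "free money" (spam) and "monday meeting" (not spam); they differ in what they learned, class-conditional word distributions versus boundary weights.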
4. Introduction
Types of Machine Learning Models
• Unsupervised Learning
Unsupervised Learning is a type of Machine Learning model that
identifies patterns and structures within un-labelled data
Email Topic Modeling
Data: A large collection of emails you may want to organize by
subject matter
Training:
Learn the distribution that generates the structure within the data
Inference
- Assign each new email to the cluster to which it has the highest
probability of belonging
5. Introduction
Types of Machine Learning Models
• Reinforcement Learning
Interaction: The agent interacts with the environment by choosing actions from its current policy
A self-driving car decides to take a left or a right based on its current strategy and the current state
of the road
Reward/Penalty: After each action, the agent receives a reward/penalty which reflects the
success of the action
If the car safely navigates traffic or obeys rules, it’s a success
Policy Update: The agent updates the policy based on the feedback received, aiming to maximize the
total reward over time.
Based on the reward/penalty received, the car adjusts its driving policy: actions with positive
rewards will be repeated and actions with negative rewards will be avoided
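The interaction / reward / policy-update loop can be sketched with a very small agent. This is a stateless simplification (closer to a two-armed bandit than to driving), and the function names, the reward values, and the epsilon-greedy rule are all assumptions chosen for illustration.

```python
import random


def toy_reward(action):
    """Hypothetical environment: turning left is safe (+1), right is not (-1)."""
    return 1.0 if action == "left" else -1.0


def run_agent(reward_fn, actions, episodes=200, lr=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # The "policy" here is simply: prefer the action with the highest estimate.
    values = {a: 0.0 for a in actions}
    for _ in range(episodes):
        # Interaction: mostly follow the current policy, sometimes explore.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(values, key=values.get)
        # Reward/penalty: the environment scores the chosen action.
        reward = reward_fn(action)
        # Policy update: move the estimate toward the observed reward.
        values[action] += lr * (reward - values[action])
    return values


values = run_agent(toy_reward, ["left", "right"])
best_action = max(values, key=values.get)
```

After training, the estimate for the rewarded action is higher than for the penalized one, so the greedy policy keeps repeating it.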
• Shallow and Deep Models
Models with few layers, capable of capturing only linear and simple
nonlinear relationships, are called shallow models
Models with many layers, capable of capturing complex hierarchical patterns,
are called deep models
7. Generative AI
GPT, GAN and Diffusion Models
Applications of Generative AI
Emerging Trends, Limitations, Potential Ahead
8. Generative AI
Definition
Generative AI refers to a set of artificial intelligence methodologies that
can produce novel content that resembles the training data they were
exposed to.
9. Generative AI
Generative Pre-Trained Transformer (GPT) - Motivation
Issues with CNNs, RNNs, LSTMs
- Convolutional Neural Networks (CNNs) are good at local feature extraction but struggle to
understand long-range dependencies in data
- CNNs have no mechanism for understanding the order of elements, which makes them harder to apply to problems
involving text and time series
- RNNs, especially LSTMs, can handle long-range dependencies due to their ability to process data
sequentially. But as the sequences get longer, they suffer from vanishing gradient problems
- CNNs, RNNs and LSTMs are each suited to specific data types and are not efficient at handling multi-modal
inputs
What if you can completely avoid recurrent connections, thereby avoiding vanishing gradient issues?
10. Generative AI
Generative Pre-Trained Transformer (GPT) - Motivation
• A new architecture called the Transformer was proposed
by researchers at Google; it avoids recurrent
connections altogether by relying on an operation
known as attention
• This architecture also accounts for the sequential nature
of inputs by using positional embeddings
https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
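One way to inject position information is the fixed sinusoidal encoding described in the linked paper: even dimensions use a sine and odd dimensions a cosine of the position scaled by a dimension-dependent frequency. The sketch below assumes that variant (learned positional embeddings are another common choice), and the function name is illustrative.

```python
import math


def positional_encoding(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal position vectors."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share the frequency 1 / 10000^(2k / d_model).
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = positional_encoding(num_positions=4, d_model=8)
```

Each row is added to the token embedding at that position, so two identical words at different positions end up with different input vectors.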
11. Generative AI
Generative Pre-Trained Transformer (GPT) - Attention
Let's take an example sentence.
Alice, who has a black cat, loves going to the park
- When the model is processing the word “loves”, the attention
mechanism allows it to associate that word with “Alice”
- At each word, the attention mechanism lets the model look at words at
other positions in the input sequence to better encode the word at the
current position
1. At each input position, calculate query, key and value vectors (a
linear transformation of the embeddings using learnt weight matrices)
2. Compute the dot product between each query and all the keys in the
input sequence; scaled and passed through a softmax, these scores become the attention weights
3. Compute a weighted sum of all value vectors using the attention
weights as coefficients
https://arxiv.org/pdf/1706.03762.pdf
http://jalammar.github.io/illustrated-transformer/
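The three steps above can be sketched in plain Python for a single attention head. This is a minimal illustration, not an efficient implementation: the helper names are made up, and the scaling by the square root of the key dimension and the softmax follow the linked paper rather than the slide text.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings, w_q, w_k, w_v):
    # Step 1: query, key and value vectors are linear maps of the embeddings.
    q = matmul(embeddings, w_q)
    k = matmul(embeddings, w_k)
    v = matmul(embeddings, w_v)
    # Step 2: dot product of every query with every key, scaled, then softmax.
    d_k = len(k[0])
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Step 3: each output is a weighted sum of the value vectors.
    return matmul(weights, v), weights


# Two toy token embeddings and identity projections (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, attention_weights = self_attention(tokens, identity, identity, identity)
```

With identity projections each token attends most strongly to itself, and every row of `attention_weights` sums to 1.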
12. Generative AI
Generative Pre-Trained Transformer (GPT) –Transformers Architecture
https://arxiv.org/pdf/1706.03762.pdf
http://jalammar.github.io/illustrated-transformer/
https://cs182sp21.github.io/static/slides/lec-12.pdf
Architecture
• Six Encoder layers stacked
• Six Decoder layers stacked
• Positional Embeddings
• Masked Self-Attention and Encoder-Decoder
Attention (in the decoder)
Advantages
• Better long-range connections
• Easier to parallelize
• Can make the networks much
deeper (more layers) than RNNs
13. Generative AI
Generative Pre-Trained Transformer (GPT)
A Generative Pre-Trained Transformer is a kind of
Transformer model developed by OpenAI for natural
language processing tasks
- Generative refers to the model’s ability to generate
text
- Pre-Trained refers to the model’s training process,
which consists of two stages
- Pre-Training: The model is trained on a large corpus
of text data, where the objective is to predict the
next word in a sentence
- Fine-tuning: Once the model is pre-trained, it
can be fine-tuned on a specific task with a
task-specific dataset using supervised learning
https://arxiv.org/pdf/2005.14165.pdf https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
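The pre-training objective, predicting the next word, can be illustrated without a neural network. The sketch below builds (context, next word) training pairs from raw text and fits a bigram count table as a stand-in predictor; GPT models use a Transformer over the whole context instead, and the corpus and function names here are hypothetical.

```python
from collections import Counter, defaultdict


def next_word_pairs(text):
    """Turn a sentence into (context so far, next word) training pairs."""
    tokens = text.split()
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def train_bigram(corpus):
    """Count which word follows each word; a crude stand-in for pre-training."""
    table = defaultdict(Counter)
    for sentence in corpus:
        for context, target in next_word_pairs(sentence):
            table[context[-1]][target] += 1
    return table


def predict_next(table, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in table:
        return None
    return table[word].most_common(1)[0][0]


corpus = ["the cat sat on the mat", "the cat ate the fish"]
table = train_bigram(corpus)
```

No labels are needed beyond the text itself: every position in the corpus supplies its own target, which is what makes training on very large corpora practical.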
14. Generative AI
Generative Pre-Trained Transformer (GPT) – GPT-3 Training Data and Parameters
https://arxiv.org/pdf/2005.14165.pdf
[Figures: Dataset; Parameters]
15. Generative AI
Generative Pre-Trained Transformer (GPT) – GPT-3 Unreasonable Effectiveness
https://arxiv.org/pdf/2005.14165.pdf
TriviaQA is a reading comprehension dataset containing over 650K
question-answer-evidence triples.
https://nlp.cs.washington.edu/triviaqa/
16. Generative AI
Generative Pre-Trained Transformer (GPT) – LLM Landscape
https://arxiv.org/pdf/2304.13712.pdf https://amatriain.net/blog/transformer-models-an-introduction-and-catalog-2d1e9039f376/
Encoder Models: These models map input
sequences to vector representations.
Useful for extracting features (BERT)
Decoder Models: These models generate
an output sequence one token at a time,
conditioning on the tokens produced so far.
Useful for generating text, images etc. (GPT-3)
Encoder-Decoder Models: These models
combine an encoder and a decoder. The
encoder maps the input into vector
representations and the decoder generates
the output sequence from them.
(BART / T5 / FLAN-UL2)
17. Generative AI
Generative Pre-Trained Transformer (GPT) – Chain-of-thought Prompting
https://arxiv.org/pdf/2201.11903.pdf
Chain-of-Thought Prompting is a technique that enables LLMs to perform complex reasoning by
generating a chain of thought, a series of intermediate reasoning steps.
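The difference between standard few-shot prompting and chain-of-thought prompting is easiest to see in the prompt text itself. The sketch below builds both variants; the `build_prompt` helper is hypothetical, and the worked example is modeled on the tennis-ball example in the paper cited above.

```python
def build_prompt(exemplar_question, exemplar_answer, new_question):
    """Assemble a one-shot prompt: a solved example followed by a new question."""
    return (f"Q: {exemplar_question}\nA: {exemplar_answer}\n\n"
            f"Q: {new_question}\nA:")

exemplar_question = ("Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
                     "Each can has 3 tennis balls. How many tennis balls does he have now?")
new_question = ("The cafeteria had 23 apples. If they used 20 to make lunch and "
                "bought 6 more, how many apples do they have?")

# Standard prompting: the exemplar shows only the final answer.
standard_answer = "The answer is 11."
# Chain-of-thought prompting: the exemplar spells out the intermediate steps.
cot_answer = ("Roger started with 5 balls. 2 cans of 3 tennis balls each is "
              "6 tennis balls. 5 + 6 = 11. The answer is 11.")

standard_prompt = build_prompt(exemplar_question, standard_answer, new_question)
cot_prompt = build_prompt(exemplar_question, cot_answer, new_question)
```

Sending `cot_prompt` to a model encourages it to write out its own reasoning steps for the new question before giving the final answer.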
[Figures: Prompting; Datasets and Example Problems; Performance]
18. Generative AI
Generative Pre-Trained Transformer (GPT) – Alignment - RLHF
https://arxiv.org/pdf/2212.08073.pdf
• Reinforcement Learning from Human Feedback (RLHF) is a
technique that lets models learn from human feedback on their
outputs (such as preference rankings) rather than from
conventional labeled targets
• Because the training data is scraped from the internet
(and contains a lot of misinformation, conspiracy theories etc.), the
models must be further polished/aligned using RLHF to make
them appropriate for users
https://huyenchip.com/2023/05/02/rlhf.html
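One common ingredient of RLHF pipelines is a reward model trained on pairs of responses where a human marked one as preferred. The sketch below shows a pairwise loss often used for that step; it is an assumption about a typical setup, not something the slide specifies, and the reward scores are made-up numbers.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(reward_chosen - reward_rejected))."""
    margin = reward_chosen - reward_rejected
    return math.log(1.0 + math.exp(-margin))

# Hypothetical reward-model scores for two responses to the same prompt.
loss_when_agreeing = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
loss_when_disagreeing = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
```

The loss is small when the reward model scores the human-preferred response higher and large when it ranks the pair the wrong way round. The trained reward model then supplies the reward signal for the reinforcement learning step.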
19. Generative AI
Generative Adversarial Networks (GANs)
https://arxiv.org/pdf/1406.2661.pdf
Imagine you have a bunch of cat images, and you want a machine learning model to create similar images.
This is exactly what a GAN does.
Generator: Takes in random numbers as input and generates the images of interest (the forger)
Discriminator: Takes both the images from the generator and the real images from the data and spots the
difference between them (the detective)
Both the generator and the discriminator are trained together. Over the course of training, the
generator gets better at creating images that look real, and the discriminator gets better at spotting fakes.
Adversarial Objective: The two networks are pitted against each other: the generator creates more
realistic synthetic images to fool the discriminator, while the discriminator network tries to get better at
detecting fake images. This back-and-forth forces both networks to improve until the generator
can create highly realistic synthetic images that are indistinguishable from real images
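The adversarial objective can be written as two loss functions over the discriminator's outputs. In the sketch below, each number is the discriminator's estimated probability that an image is real. The discriminator loss follows the objective in the cited GAN paper, and the generator loss is the non-saturating variant that the same paper suggests in practice. The sample probabilities are made up.

```python
import math

def discriminator_loss(d_real, d_fake):
    """The detective wants d_real close to 1 and d_fake close to 0."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """The forger wants the discriminator to score its fakes close to 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Early in training the discriminator easily spots fakes (made-up values).
early = generator_loss([0.1, 0.2])
# Later the fakes fool it about half the time.
late = generator_loss([0.5, 0.5])
```

Training alternates between lowering `discriminator_loss` by updating the discriminator and lowering `generator_loss` by updating the generator.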
20. Generative AI
Diffusion Models
Diffusion models are another class of generative models. During training, noise is gradually added to the
training images in a process called forward diffusion, and the model learns to reverse that process
(reverse diffusion) to recover the original image. These models can be trained on large unlabeled datasets in an
unsupervised manner.
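The forward diffusion step can be sketched with the closed-form expression used in DDPM-style models, where a noisy image is a mix of the original image and Gaussian noise. The slide does not name a specific formulation, so this choice is an assumption, and the three "pixels" are toy values.

```python
import math
import random

def forward_diffuse(x0, alpha_bar_t, rng):
    """Return x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar_t) * x + math.sqrt(1.0 - alpha_bar_t) * n
          for x, n in zip(x0, noise)]
    return xt, noise

rng = random.Random(0)          # fixed seed so the sketch is repeatable
pixels = [0.2, 0.8, 0.5]        # a toy "image" with three pixel values

# alpha_bar_t near 1: the image is almost untouched.
slightly_noisy, _ = forward_diffuse(pixels, 0.99, rng)
# alpha_bar_t at 0: nothing of the image is left, only noise.
only_noise, eps = forward_diffuse(pixels, 0.0, rng)
```

During training the network sees the noisy image and learns to predict the `noise` that was added; generation then runs the process backwards, starting from pure noise.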
Stable Diffusion: Stable Diffusion is a text-to-image model. A stable diffusion model has four important
elements
- Diffusion Probabilistic Model
- U-Net Architecture
- Latent Text Encoding
- Classifier-Free Guidance
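Of the four elements, classifier-free guidance is the simplest to show in code. At each denoising step the model makes two noise predictions, one with the text prompt and one without, and the final prediction is pushed away from the unconditional one toward the text-conditioned one. The function name and the numbers below are illustrative assumptions.

```python
def guided_noise(eps_uncond, eps_cond, guidance_scale):
    """Combine unconditional and text-conditioned noise predictions."""
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

eps_uncond = [0.0, 1.0]   # prediction without the prompt (made-up values)
eps_cond = [1.0, 3.0]     # prediction with the prompt (made-up values)

no_guidance = guided_noise(eps_uncond, eps_cond, 0.0)    # ignores the prompt
plain_cond = guided_noise(eps_uncond, eps_cond, 1.0)     # conditioned prediction only
strong = guided_noise(eps_uncond, eps_cond, 7.5)         # exaggerates the prompt's effect
```

Larger guidance scales make images follow the prompt more closely, usually at the cost of some diversity.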
https://stablediffusionweb.com/
24. Applications
Midjourney
Prompt: Imagine a small seed planted in the ground. It
sprouts, grows into a sapling, then a small tree, and finally
a large robust tree. Each year, it sprouts new branches,
leaves and sometimes fruits – all from that small seed. This
is how your investment grows with compounding – It
branches out producing more and more just like a tree
Images Generated for this presentation
26. Generative AI at Fidelity
Ner4Opt: Named Entity Recognition for Optimization Modelling
from Natural Language
• Envision automated modeling assistant
to turn natural language into optimization
formulations
• Necessary building block: finding key
pieces of information relevant to
optimization
• Ner4Opt: extracting optimization-related
information such as the objective,
constraints, and variables from free-form
natural language text
https://link.springer.com/chapter/10.1007/978-3-031-33271-5_20 https://huggingface.co/spaces/skadio/Ner4Opt
27. Generative AI at Fidelity
Ner4Opt: Named Entity Recognition for Optimization Modelling from Natural Language
https://link.springer.com/chapter/10.1007/978-3-031-33271-5_20
28. Generative AI at Fidelity
Ner4Opt: Named Entity Recognition for Optimization Modelling from Natural Language
https://link.springer.com/chapter/10.1007/978-3-031-33271-5_20 https://nl4opt.github.io/results/
29. Generative AI at Fidelity
Understanding BLOOM: An empirical study on diverse NLP tasks
Compares the open-source BLOOM with other models like BERT/GPT
• Does the performance of BLOOM scale with parameters?
The authors noticed that the performance of BLOOM doesn’t scale with
parameter size, unlike models like BERT
• Does fine-tuning improve the performance?
The authors added multiple zero-shot cross-lingual and multilingual
fine-tuning experiments, suggesting BLOOM is on par with or worse than
monolingual GPT-2 models
• What about toxicity in the generated data?
Toxicity analysis of prompt-based text generation using the
RealToxicityPrompts dataset shows that the text generated by BLOOM is at least
17% less toxic than GPT-2 and GPT-3 models.
https://arxiv.org/pdf/2211.14865.pdf
30. Generative AI at Fidelity
Correcting Semantic Parses with Natural Language through Dynamic Schema Encoding
https://arxiv.org/pdf/2305.19974.pdf
• There are several semantic and syntactic challenges in
converting natural language text to SQL queries
• In this paper, the authors approach semantic parse
correction using natural language feedback
• With just one turn of correction, the authors saw an
accuracy improvement of up to 26%
• They also show that a T5-base model can correct the
errors of a T5-large model in a zero-shot cross-parser
setting.
31. Generative AI
• Major breakthroughs in deep learning architectures like Transformers and Generative Adversarial
Networks
• Availability of massive datasets and GPU/TPU compute
• New advances in techniques like RLHF/Prompting have made it much easier to align these models
• Low barrier to entry due to intuitive and user-friendly interfaces and a strong open-source ecosystem
• GenAI holds the potential to create photo-realistic images, human-like speech and text, and to generate
working code from natural language descriptions, none of which was possible until recently
Emerging Trends
32. Things to Keep in Mind
1. Lack of Consistency (Hallucination): LLMs can produce widely different, and sometimes fabricated, answers when the
same question is asked multiple times
2. Bias: As the models are trained on data scraped from the internet, they may inherit the
biases present in the training data
3. Interpretability: It is difficult to understand why a particular response or content is generated,
making these models very challenging to use where explainability is inherently required.
4. Real-time Knowledge: As the models are trained on a fixed dataset at a particular point in time,
they lack information about changes that occurred after that point.
5. Memory: Even though the context lengths these models support keep growing,
efficiently remembering the important details of conversations over a long period
of time is still a challenging task.
6. Engineering Challenges: Operating these semi-non-deterministic models, especially in a
multimodal setting (including voice, text, images etc.), at scale remains a significant challenge
33. Potential of Generative AI
https://www.forbes.com/sites/bernardmarr/2023/05/31/the-future-of-generative-ai-beyond-chatgpt/?sh=161c85da3da9
1. Low-Resource Languages – The ability to understand and generate any language, especially low-resource ones,
could help in studying languages and historical documents in general
2. Inclusion and Accessibility – Avatars proficient in sign languages, high-precision caption generation etc.
could increase accessibility for all people
3. Personalized Content Generation – Video games, music and movies can be created at scale to cater to
users' individual interests
4. AI Tutors – Imagine a world where you can conjure up a tutor to teach you any skill you would like to learn at
your own pace
5. Intelligent Assistants – Laborious and repetitive tasks can be delegated to Intelligent Assistants, allowing
humans to focus on critical thinking and decision making
6. Accelerating Scientific Discovery – General advances in AI can help accelerate scientific discovery by
generating deep insights from massive datasets and designing new algorithms. This can help solve the most
challenging problems we face today.
34. AI Center of Excellence @ Fidelity
• [arXiv’23] Explainable AI with Booleans BoolXAI https://github.com/fidelity/boolxai
• [NeurIPS’22, CPAIOR’23] NER for Optimization Ner4Opt https://github.com/skadio/ner4opt
• [IJAIT’21] Recommender Systems Mab2Rec https://github.com/fidelity/mab2rec
• [AAAI’21] NLP/Text Featurization TextWiser https://github.com/fidelity/textwiser
• [ICTAI’20] Multi-Armed Bandits MABWiser https://github.com/fidelity/mabwiser
• [AI Magazine’23, AAAI’22] Sequential Mining Seq2Pat https://github.com/fidelity/seq2pat
• [CPAIOR’22] Feature Selection Selective https://github.com/fidelity/selective
• [ICMLA’21] Fairness & Bias Mitigation Jurity https://github.com/fidelity/jurity
Research & Open-Source Software
github/fidelity
https://jobs.fidelity.com/