On this episode of the AI For All Podcast, Ismaen Aboubakare, Head of Developer Advocacy at Airkit.ai, joins Ryan Chacon and Neil Sahota to discuss prompt engineering for businesses. They talk about what makes a good prompt, the role of prompt engineers, prompt engineering versus high level programming languages, domain expertise, who hires prompt engineers, getting started with prompt engineering, open source LLMs, and how businesses can start adopting AI.
About Ismaen Aboubakare
Ismaen Aboubakare is the Head of Developer Advocacy at Airkit.ai. Prior to joining Airkit, Ismaen worked at Microsoft as a Power Platform Customer Engineer and a Consultant. Before that, he worked as a software engineer at GE Capital. Ismaen also holds several certifications from Microsoft, including Worldwide Communities - Community SME 2020, Accessibility in Action, Architecting Microsoft Azure Solutions (Exam 70-534), and Microsoft 365 Certified: Security Administrator Associate.
Interested in connecting with Ismaen? Reach out on LinkedIn!
About Airkit.ai
Airkit.ai, which was acquired by Salesforce, helps brands automate and scale their customer service capabilities using the latest AI agent capabilities powered by generative AI (GPT-4, specifically). Whereas most legacy ‘AI-powered’ tools have operated as little more than generic FAQ respondents, businesses can now extend their customer service teams’ reach by incorporating AI agents that can do everything from sending a replacement product and issuing a refund to changing an address and rescheduling an appointment.
Key Questions and Topics from This Episode:
(00:29) Introduction to Ismaen Aboubakare and Airkit.ai
(00:43) What is a prompt?
(02:42) What is a prompt engineer?
(05:07) Prompt engineering vs high level programming languages
(08:17) The importance of parameters
(09:43) The value of domain expertise
(11:50) Who hires prompt engineers?
(14:27) Getting started with prompt engineering
(17:40) Is ChatGPT getting worse?
(18:50) Open source LLMs
(21:34) Best open source LLM?
(23:48) How can businesses start adopting AI?
(28:56) Handling privacy
(30:55) Learn more about Airkit.ai
Transcript:
- [Ryan] Welcome everybody to another episode of the AI For All Podcast. I'm Ryan Chacon, here with my co-host Neil Sahota, the AI Advisor to the UN and founder of AI for Good. Neil, how's it going?
- [Neil] Yeah, I'm doing all right. Just another day in paradise.
- [Ryan] We have Nikolai, our producer here as well.
- [Nikolai] Hello.
- [Ryan] All right. So before I introduce our guest today, the focus of this conversation is going to be the importance of prompts in AI, the role of a prompt engineer, the evolution of open source LLMs, and their role in businesses.
And to discuss this, we have Ismaen Aboubakare, the Head of Developer Advocacy at Airkit.ai. They are a company that helps brands automate and scale their customer service capabilities using AI. Ismaen, nice to have you here.
- [Ismaen] Yeah. Nice to meet you all. Thanks for having me.
- [Ryan] So let's kick this off. Let's talk about what the importance of a prompt is. And when we say prompt, what do we mean by that in the AI space? I know a lot of us who have been playing around with ChatGPT and other AI tools probably have an idea, but there might be some listeners who aren't caught up yet. When we're talking about a prompt, what does that mean? What is the importance, and why does it matter?
- [Ismaen] A prompt is what you use to essentially interact with a model or an LLM, right? I think Andrej Karpathy, I don't know if I'm pronouncing that name right, actually tweeted, I think it was in January, that the next programming language is going to be natural language, or English, if you will, which is essentially instructing an LLM to do something or get an output based off of what you're intending it to do. So I find prompts are really important because there are different prompting techniques you can use as you're trying to engineer prompts to get a better output, such as few shot prompting, which is giving it examples of the kind of output you're trying to get out of the LLM. Or maybe chain of thought, which is having the LLM think in a series of steps. And so prompts are just essentially the tool that you have to interact with a model.
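The few shot pattern Ismaen mentions can be sketched in a few lines. The sentiment-classification task and example pairs below are invented for illustration; the point is simply that labeled examples are prepended so the model can infer the pattern before seeing the real input.

```python
# A few shot prompt prepends labeled examples so the model can infer
# the pattern; the sentiment task and examples here are made up.
def build_few_shot_prompt(examples, query):
    """Assemble a few shot prompt from (input, output) example pairs."""
    lines = ["Classify the sentiment of each review as Positive or Negative.", ""]
    for text, label in examples:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)

examples = [
    ("The battery lasts all day.", "Positive"),
    ("It broke after one week.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "Setup took five minutes and it just works.")
print(prompt)
```

The prompt ends with an open "Sentiment:" line, inviting the model to complete the pattern the examples established.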
- [Ryan] If you're out there listening and you've used ChatGPT, anything you type into the text box, anything you're asking ChatGPT to look at, deal with, or do, that's the prompt, right? And that carries over to chat bots too: anything you've typed into a chat bot is a prompt to interact with that model, to get a response or have it do what you're hoping for. Which brings us to my next question: who's responsible for building and working with these prompts?
So this is, I guess, more about the prompt engineer, right? It's a new role, definitely something that I have not heard too much about. Neil might be more familiar with it, but generally speaking, talk to us about what a prompt engineer's role is and why it's important with all the stuff we see going on in the AI space.
- [Ismaen] A prompt engineer, like you said, is all the buzz in the last few months as part of what people are calling the ChatGPT revolution, right? All the hype around AI. I think there are a lot of different definitions for it. My personal opinion and definition is someone that can successfully evaluate, apply, and productize AI in a business or in their particular use case.
And what that means is understanding, one, the different problems or use cases you can potentially address by embedding AI into your company, whether that's optimizing your workflow or using AI to get better analytics across some of your data.
Or even using AI to interact with your customers on your behalf. It's understanding what models to use, how to use those models, how to prompt them, and how to implement the AI in a better experience for whoever's using it. So I think there's a lot that goes into being a prompt engineer, and I think we're all still figuring it out.
I think I saw one of the first prompt engineer hiring positions a few months back. It's still an unknown space, if you will, but companies are trying to embed AI into their workflows, and so getting people that are familiar with how to do that, what the different use cases are, and what kind of tools are available, in a secure, compliant, and useful manner for their end users, is really important.
- [Neil] I'm going to play ignorant here for just a moment, right? Because maybe that'll help clarify a couple of things for our audience. It sounds like we're moving back to COBOL because COBOL was a programming language based on natural language.
You'd say add A to B. I know this is not the same, right? What's the difference between the prompts we're using with generative AI and something like COBOL?
- [Ismaen] The first thing that comes to mind here is that a programming language is deterministic, which means whatever you put into the program, it will compile and compute, and if you do one plus one, it'll always equal two. If you add one plus one in COBOL, it'll always equal two. But if you provide a model such as ChatGPT with a particular prompt, it may or may not always return the exact same answer. There are different facets of a model you can change, such as a property called temperature, which increases or decreases the entropy, the randomness, of the output. So that was my first impression of the difference between engineering prompts and engineering through a programming language: understanding that models are non deterministic, but there are ways to engineer and stylize prompts so that you can retrieve an output more aligned with what you're looking for. That can be done through, let's say, fine tuning, or through different ways to stylize your prompts. I've actually encountered, and this is just through my own personal testing and seeing how other people have built out prompts, writing prompts as paragraphs versus writing prompts in a specific format. There's markdown that you can write prompts in so there's more structure to your prompt, or even writing prompts with JSON structure, JavaScript Object Notation. Figuring out how to structure your prompts so that your model can understand them better also reduces the number of tokens your prompts use, which matters for what we call the context window, or the memory.
I don't know if we want to go down that rabbit hole, but at a high level, models have a context window, which is their memory: how much data you can provide to them within a session of interacting with the model.
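To make the structure and temperature points concrete, here is a sketch of a markdown-structured prompt alongside a request payload that pins `temperature` low. The model name, payload shape, and context lines are illustrative placeholders, not any specific provider's API.

```python
# A markdown-structured prompt plus a request payload; lowering
# `temperature` reduces the randomness of the model's output.
# The model identifier and payload shape are placeholders.
structured_prompt = """\
# Role
You are a customer support assistant for an online store.

# Task
Answer the customer's question using only the context below.

# Context
- Orders ship within 2 business days.
- Refunds are processed within 5 to 7 business days.

# Question
{question}
"""

payload = {
    "model": "example-model",   # placeholder model identifier
    "temperature": 0.0,         # lower values -> less random output
    "messages": [
        {"role": "user",
         "content": structured_prompt.format(question="When will my refund arrive?")},
    ],
}
print(payload["messages"][0]["content"])
```

The markdown headers give the model clear sections to attend to, and a temperature of 0 makes repeated calls as consistent as the model allows, though not strictly deterministic.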
- [Neil] Maybe to simplify that a bit, we could say that it's about parameters, right? The structure you're talking about is really around parameters. So my friend LJ Rich, a very famous musician, actually uses generative AI to compose music now. And her prompts are like 500 words long. She's saying I want to use these four tones, I want to use this harmony, this melody, I want this kind of jazzy beat, I want a reggae thing. She's looking at the structure of the lyrics and the actual music because she understands what forms them. And I think this is an area where a lot of businesses struggle: they think, okay, I'm going to go to ChatGPT and say I need to write an award winning speech on X. And that's not a sufficient prompt, right? It's not enough parameters to get a good draft. So everyone seems to be thinking, well, I've got to hire some prompt engineers.
They're turning to a lot of technologists, who may not know the space. I don't know many technologists that understand music to the depth that LJ does, and I think the real thing we're trying to get to, and it sounds like what you're actually doing, Ismaen, is: how do we put this power in the hands of the business people that actually understand the domain, understand the parameters?
- [Ismaen] I think the term prompt engineering is a scary term. In my opinion, it can imply technical, it can imply coding, it can imply programming, etc. This field is so fascinating because people from a wide range of backgrounds, whether you're technical or non technical, a musician or an artist, can use natural language to take advantage of these models. Being an expert in your domain, understanding the notes or the tonality or however you want that output to be, can actually be more effective than just understanding how to use the model itself. And at the end of the day, the ultimate way to learn how to prompt is to just spend time using these models. You can see all the different models being released, from OpenAI's GPT-3.5 and GPT-4 to Anthropic's Claude 2. There's no documentation on how to use them well. There's documentation on how to make a request to the model, but how do you get the best output? What are the best practices? The companies that are building out these models are, I think, still figuring out what their full potential is. And jumping off of your point, domain expertise can sometimes trump just understanding techniques.
- [Ryan] Let me ask you, I like how you addressed the point of what we mean when we say engineer, because to some it's a very scary technical term. What businesses are really going to be focused on hiring prompt engineers, and what kind of skills are required to be one? Is it going to be the companies that are building these LLMs and tools that get adopted by other companies? Or is it potentially also the companies that are going to be bringing these tools into their business, needing somebody who understands what kinds of prompts work well with a certain model so they can be efficient with that tool and help the business do whatever they're trying to do with it?
- [Ismaen] I foresee it actually being the latter. The businesses that are looking to embed artificial intelligence into their workflows, the businesses that are looking to automate specific interactions or even just 10X some of their productivity. I think those are the types of businesses that are going to be hiring prompt engineers.
And then there's this kind of wave, with OpenAI and the AI revolution that we're living in right now, where leveraging AI is completely different than it was, let's say, five or ten years ago as a machine learning engineer. Anyone from any background can get involved: if you are technical, you can use an API, and the developer experience around using some of these models is actually a lot simpler now. You don't have to train a model on your data from scratch. There are fine tuning APIs where you upload, let's say, a CSV of your data to have a model trained on your specific company data.
So even if you just have a developer, they can essentially take on the role of prompt engineer. I'll say that businesses looking to embed AI into their particular workflows will probably be hiring for prompt engineers, but I would also encourage them to look internally and seek out the developers that are interested in AI, because those are the people that can also harness and leverage it, and they have the domain expertise of your company.
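On the fine tuning point above: providers differ on the upload format, and OpenAI's fine tuning API, for example, expects JSONL chat examples rather than a raw CSV. A minimal sketch of preparing such a file, with made-up support Q&A data, might look like this.

```python
import json

# Sketch of preparing fine tuning data as JSONL chat examples.
# The Q&A pairs are invented; the exact schema varies by provider,
# so treat this shape as illustrative rather than definitive.
training_examples = [
    {"messages": [
        {"role": "user", "content": "What is your return window?"},
        {"role": "assistant", "content": "You can return items within 30 days of delivery."},
    ]},
    {"messages": [
        {"role": "user", "content": "Do you ship internationally?"},
        {"role": "assistant", "content": "We currently ship to the US and Canada."},
    ]},
]

# One JSON object per line is the conventional JSONL layout.
with open("training.jsonl", "w") as f:
    for ex in training_examples:
        f.write(json.dumps(ex) + "\n")
```

The resulting file is then uploaded through the provider's files endpoint and referenced when creating the fine tuning job.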
- [Neil] What would you recommend for how people should actually get started? If we're heading down this path, what's the next step for them?
- [Ismaen] I think the best way to get started is that there's a lot of online content out there.
Specifically, and now I can't remember the exact name, I think it's deeplearning.ai. There's a lot of content out there that goes over the basics of building out prompts and the different prompt techniques you can use, and you can use that as your foundation.
But really, I would say, figure out how you want to start using AI in your workflows. If you're, let's say, a marketing person trying to write a lot of blogs or social media posts, give it some topics you'd like it to tweet out or write LinkedIn posts about. Add some more description around it. Give it some examples of what you're looking for, and when I say it, I'm talking about the model, ChatGPT or something like that. See what the output is. Then continue to iterate on your prompts to get the output you're looking for. One big thing I've learned as I've been building out LLM apps, AI apps, is that prompt engineering is an iterative process.
It's completely different than what I'm used to, which in the past was building out code or apps where I know exactly what they're going to do, if then else conditional statements, right? But when you're building out prompts, sometimes the model isn't doing exactly what you want because your prompt isn't written exactly the way you need it to be for your LLM to behave the way you intend.
So one, get a foundation with specific prompt techniques and understand some of the terminology. Two, interact with the model. Just play around with ChatGPT. I would even encourage users to pay for ChatGPT Plus, and I'm not sponsored by them or anything, but I find it very useful to play with the plugins and with custom instructions, which I think they just announced, the ability to add more detail or description to how ChatGPT will respond to you, so you can give it a style or a tone. And then understand that prompt engineering is iterative. You're going to go through these cycles, but once you get it right, it's very rewarding.
- [Nikolai] Something I wanted to ask. There are a lot of people who have spent a really long time with ChatGPT, just prompting it and talking to it, and there have been claims or rumors that they've observed ChatGPT getting worse. A lot of people started investigating this to see if it was true. What are your intuitions about that and why it would even happen?
- [Ismaen] So, I don't really have an authoritative answer on this, but I'll give you my intuition. OpenAI claimed that there were no changes to the model. I follow their Head of Dev Rel over at OpenAI, who said that they haven't made any changes.
I actually suspect that people's expectations have just increased as they continually use the models, and with an increase in expectations, you fall into, I forget which fallacy it is, but you fall into this trap of thinking that the model is underperforming compared to how it did before. To really answer your question, I have no idea, but that's my thought there.
- [Neil] We can always give Sam a call, see if he can give us the insight.
- [Ryan] Let me ask you just regarding LLMs in general, how have you seen them evolve over recent years? Like just especially the open, on the open source side, like what does the evolution look like? What does that mean for the industry? And since we try to focus our conversations back to businesses and enterprises looking to adopt and understand AI better, what should people be thinking about when they hear about open source LLMs and the benefits and values it can provide for a business?
- [Ismaen] So around open source LLMs, it's honestly been evolving week by week, day by day. I think the most newsworthy open source LLM that just got released is Llama 2 by Meta. Some say it's as accurate or as powerful as GPT-3.5, I think. And with that evolution, when a large company such as Meta releases an open source LLM, what it does is spin off a ton of other open source LLMs, creating better models. One thing you get from working with those is that you get to understand the inner workings of the model. With a closed source model like OpenAI's, you don't really see inside; you maybe know how many parameters it was trained with, and you know the different rate limits and all that. But with an open source LLM, you get to understand the inner workings, the functionality. You're able to contribute to it in an open source nature.
And you're also able to bring it into and have full control when it comes to your own business or the enterprise. And you get to maintain the performance of the LLM. You get to ensure or have full control over like the data privacy of that model. So if you want to, let's say, take Llama 2 and fine tune it to your specific needs without the, I guess like without the additional kind of like costs that it takes to fine tune a closed source model, you can do that. And so I think the biggest thing is around control, around I think data privacy is a big one, control around cost and then control around like performance. Yeah, I think those are some of the things that you want to keep in mind when working with open source LLMs.
- [Neil] So if you had to pick one, who would you go with?
- [Ismaen] Probably Llama 2. Its largest version has like 70 billion parameters. What Meta is doing around open source LLMs is actually really cool: they have different models for different tasks, right? Llama 2 is actually not very good with coding capabilities, but I think there was some news around them releasing a coding-specific one, I think they call it Code Llama. They also just released Seamless M4T, which is a multimodal model for working across different languages, like speech to text and text to speech. So I think I would probably use one of Meta's models, just because of the community behind them and the resources that they have.
- [Neil] And how long do you think Llama 2 will have that advantage?
- [Ismaen] I actually have no idea, just because I feel like every week there's a new open source LLM. There's this website called Hugging Face, and they have an open LLM leaderboard. Right now, and I actually looked this up before the show, I think the leader is Platypus, a merged model built from a model called Platypus on top of Llama. They took these two models and made them into one, and right now that's the leading one. I haven't tried it out or anything like that. There's also an open source project called God Mode, a desktop app where you type in a prompt and it opens up five different windows of different models all at once. So you can try out the same prompt in ChatGPT, Google Bard, I think Claude 2, and a bunch of other open source LLMs to compare them. If you're interested in how other models perform, I would totally recommend that.
- [Ryan] How do companies listening to this make a decision on how to even get started with adopting AI tools and LLMs and bringing this into their business, whether it's creating an experience for their customers or just internally? There's a lot of stuff we're talking about here which may be over the heads of some people, just because they don't pay attention to the AI space as much. They're just looking for something to do a certain job and help their business move forward or compete with their competitors. So how do you think companies looking to adopt AI solutions or bring an LLM into their business should get started down that road without feeling overwhelmed? What's the approach or advice you have for them?
- [Ismaen] I would start with low risk, probably text based, use cases. I say low risk because there are multiple kinds of use cases when it comes to leveraging AI within your business.
And some involve, let's say, data that you don't want leaked, or that you don't want in front of a customer. So I would say start with internal use cases, such as: hey, I have a marketing team, they need to post on social media every day, they need to put out a blog post every day, and they need to make sure that blog post aligns with these SEO keywords. Leveraging even just ChatGPT for some of those tasks can be really effective. You can pretty much get an entire blog post written for your company.
You can even, for example, if you're hosting a podcast, give it a list of topics and have it create questions for you for that podcast. If you have a recording from Zoom or from that podcast, you can put the transcript into ChatGPT and have it create a blog post out of that as well. Using it for internal, text based use cases would be my recommendation, just because the learning curve is not as high. Then, going a little bit further, there's this concept called AI agents, which is attaching tools to an LLM as its brain, if you will. An example of that is what we're doing here at Airkit.ai: we're building out AI agents for customer experience. For pretty much any support interaction you have with a brand that doesn't need an actual human, such as tracking where's my order or maybe issuing a refund, all those tasks are attached to a system, right? You have a CRM such as Zendesk or something like that. You can have your model talk to Zendesk and, given a user's phone number or maybe an order number, get the information straight from Zendesk. A model doesn't have access to the internet. It doesn't have access to these tools. You have to give it access. And giving your model access to these tools means it can then interact with a customer on your behalf. So issuing refunds, knowledge base questions, right? Like, oh, I have an issue with my mouse, something someone could probably figure out through a quick Google search, but they went straight to the support email, and you can deflect those support requests in a graceful manner. People get their questions answered faster. You don't have to wait 24 hours for a response from an agent.
I think the next step beyond these low risk, text based use cases is actually giving your model access to your tools and your systems, along with guardrails, so that it can perform tasks on your behalf.
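The AI agent pattern Ismaen describes, where the model can only reach systems you explicitly expose to it, can be sketched as a tool registry plus a dispatcher. The tool names and CRM behavior below are invented stand-ins, not Airkit's or Zendesk's actual API.

```python
# Toy sketch of the agent pattern: the model never touches your systems
# directly -- it can only request tools you have registered, and a
# dispatcher runs them with an allowlist as a basic guardrail.
TOOLS = {}

def tool(fn):
    """Register a function the agent is allowed to invoke."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def track_order(order_number: str) -> str:
    # A real agent would query a CRM such as Zendesk here.
    return f"Order {order_number} is out for delivery."

@tool
def issue_refund(order_number: str) -> str:
    return f"Refund initiated for order {order_number}."

def dispatch(tool_call: dict) -> str:
    """Execute a model-requested tool call if it is on the allowlist."""
    name, args = tool_call["name"], tool_call["arguments"]
    if name not in TOOLS:
        return f"Tool '{name}' is not permitted."
    return TOOLS[name](**args)

print(dispatch({"name": "track_order", "arguments": {"order_number": "A1001"}}))
# → Order A1001 is out for delivery.
```

In a real system, the model's structured output (for example, a function call response) would be parsed into the `tool_call` dict, and the dispatcher's result would be fed back into the conversation.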
- [Ryan] How do you handle like privacy with that kind of thing?
- [Ismaen] That's always a question that comes up, and there are mechanisms you can build, within our platform or on your own, to require a form of authentication before actually having the model retrieve data, right?
One way that we've done this is through two factor authentication: having the system send an SMS message to the user. Say they're telling us, hey, this order is going to the wrong delivery address, I actually moved and I want to send it to my apartment instead. That's considered personal information, an address tied to a particular order number or email. So we have mechanisms in place to say, hey, this is personal information. Based off of your order number or your email address, we have a phone number on hand, and we just sent you a text message with a six digit code. Can you reply back with that six digit code? If it's correct, then you continue with your workflow. If it's incorrect, that essentially flags to the system: hey, this person could be doing something malicious. They could be attempting what's called prompt injection, or jailbreaking the model. Let's flag this, let's close the conversation down. So there are mechanisms you can build, and that we've built into our platform, to give you that layer of privacy and security.
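The verification flow described above can be sketched as follows. The six digit code generation and the flag-on-failure logic are illustrative, not Airkit's actual implementation.

```python
import secrets

# Toy sketch of the SMS verification step: generate a six digit code,
# "send" it out of band, and gate data access on the user's reply.
def generate_code() -> str:
    """Return a random six digit code, zero padded."""
    return f"{secrets.randbelow(10**6):06d}"

def verify(expected: str, reply: str) -> tuple[bool, bool]:
    """Return (ok, flagged); a wrong code flags the session as suspicious."""
    if reply.strip() == expected:
        return True, False
    # A wrong code may indicate prompt injection or a hijack attempt,
    # so flag the conversation for review or shutdown.
    return False, True

code = generate_code()          # in production: deliver this via SMS
ok, flagged = verify(code, code)
```

Using `secrets` rather than `random` matters here: authentication codes must come from a cryptographically secure source.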
- [Neil] Well, I think while we're waiting for DeepMind to release Gemini, Ismaen, what's the best way for the people to learn more about Airkit and the work that you're doing and the space as a whole?
- [Ismaen] If you want to learn more about Airkit and what we're doing, you can go to airkit.ai and check out our work in the AI agents space across e-commerce. We're building out AI agents for customer experience and customer support, and we're helping customers across different industries. We recently had a mid sized customer in the retail industry use our AI agents to reduce inquiry times and improve resolution rates. So you can go there to learn more about us. You can also follow me on Twitter @ismaen_ to follow what we're doing.
- [Ryan] When do you think like people will start calling it X? Do you think it'll just always be, default to Twitter? Because I still see like the icons pop up and people still use the Twitter icon because it's obviously recognizable, right? But yeah, thank you so much for taking the time. It was really cool.
It was nice to talk about some topics we haven't covered before. We've talked about chatbots in the past, but it was great to discuss the importance of prompts, prompt engineers as a new role that people are going to start seeing and needing to look into, and AI agents. Outside of AI, in the software world, we talk about APIs and plugins as ways to bring data in, but how does that work and apply in the AI space? Those agents are going to play a huge role. So it's really cool what you have going on, and thanks for coming on to share your insights with us.
- [Ismaen] Thanks for having me.
Special Guest
Ismaen Aboubakare
- Head of Developer Advocacy, Airkit.ai
Hosted By
AI For All
Subscribe to Our Podcast
YouTube
Apple Podcasts
Google Podcasts
Spotify
Amazon Music
Overcast