Non-human travel agents are here. Virgin Atlantic earlier this month installed an AI travel agent on its website, calling the web-bound chatbot “the future of travel planning.”
The project, dubbed Virgin Atlantic Concierge, was implemented by Tomoro, a UK-based AI design consultancy, in partnership with OpenAI.
The Concierge service supports typed or spoken input, provided the Virgin Atlantic site visitor has granted permission to use the device microphone. The bot, which can respond using simulated speech as well as text, takes longer to process spoken input than typed input, presumably because it has to run captured audio through a speech-to-text model.
Sam Netherwood, co-founder and director of Tomoro, spoke to The Register about the development of Concierge and the project’s goals.
“Initially, Virgin engaged us,” Netherwood said. “We partner quite closely with OpenAI. Virgin also had already entered into an agreement with OpenAI. And so that’s how it came together in the first place. And initially we just did a proof-of-concept over about six weeks.”
The goal, he explained, was to test some hypotheses about whether the technology could deliver an AI interaction experience that would be valuable and to explore how that experience might feel from a consumer perspective in light of Virgin Atlantic’s ambitions.
“So the proof-of-concept that we built was using OpenAI’s real-time speech-to-speech model,” said Netherwood. “That’s built for ultra-low latency, really natural, human-feeling conversation. But there are still some question marks around the intelligence of that model.”
Concierge is an agent, meaning that it’s an AI model given access to tools and APIs. As Netherwood observed, the agent doesn’t work if it can’t access Virgin Atlantic’s APIs to fetch flight information and book travel – it wouldn’t be useful if it just dealt with a subset of pre-canned routes or certain holidays. The agent also needs to be able to determine the user’s intent, in order to properly differentiate between support requests, travel planning, and other scenarios.
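To make the tool-calling idea concrete, here is a minimal sketch of how an agent runtime might route a model's tool request to a backing function. It is illustrative only: the tool names, arguments, and return values are hypothetical and are not Virgin Atlantic's or Tomoro's actual APIs.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for airline APIs; names and return data are illustrative only.
def search_flights(origin: str, destination: str) -> dict:
    return {"tool": "search_flights", "origin": origin, "destination": destination}

def baggage_allowance(cabin: str) -> dict:
    return {"tool": "baggage_allowance", "cabin": cabin}

# Registry mapping the tool names a model may request to the code that serves them.
TOOLS: Dict[str, Callable[..., dict]] = {
    "search_flights": search_flights,
    "baggage_allowance": baggage_allowance,
}

def dispatch(tool_call: dict) -> dict:
    """Run the tool the model asked for, or return an error payload instead of raising."""
    name = tool_call.get("name")
    args = tool_call.get("arguments", {})
    tool = TOOLS.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return tool(**args)
    except TypeError as exc:
        # The model supplied arguments the tool does not accept.
        return {"error": f"bad arguments for {name}: {exc}"}
```

In a sketch like this, a request for an unregistered tool or one with malformed arguments comes back as an error the agent can recover from, which is one reason tool-calling accuracy matters so much to the experience.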
“So instruction following has to be really good,” he said. “Tool calling has to be really good. And actually in the proof-of-concept we were like, ‘I don’t think we’re quite there yet.’”
That was when OpenAI’s Realtime API was in beta testing. In August, the Realtime API reached general availability and the project advanced toward its public release.
“Now we have a better version,” Netherwood said. “I think we’re probably not a million miles away now from us having the tool-calling accuracy and the instruction-following accuracy to enhance the experience with a much more natural conversational exchange, which really was like one of the fundamental things that Virgin Atlantic challenged us to do.”
New infrastructure was not required. Netherwood said that while edge computing holds a lot of promise for being able to extend Concierge functionality to interactions when boarding an airplane, for example, the focus of the initial release has been on the web portal.
“At the moment,” Netherwood said, “we’re only embedded in Virgin Atlantic’s website, not behind any authentication and the reason for that is just that their mobile app is coming – they’ve built a new mobile app from the ground up. I think once the Concierge is connected to an account, to a person’s user ID, it becomes much more valuable because the itinerary that the agent creates can persist over time and can be added to.”
Presently, Netherwood explained, the agent can access Virgin Atlantic’s flight and holiday APIs, enabling it to find and book flights and handle holiday planning. It can also answer questions about the company’s Flying Club loyalty program and provide basic support in the interest of call center deflection – discouraging calls to call centers.
Netherwood couldn’t provide a breakdown of the kinds of requests being made, but said a lot of people just want answers to basic support questions about baggage allowances and the like. For unplanned events, like airport disruptions, the agent will fall back to the website when it doesn’t have relevant information.
The Concierge in its present form also stops short of completing travel transactions – it deep links to the Virgin Atlantic website to handle credit card purchases. Netherwood, however, said that will change in the mobile app.
Safety, Netherwood said, has been top of mind.
“A huge part actually of this MVP [minimum viable product], a huge consideration, was safety because we’re not behind a login or authentication,” he said. “Anybody can come to this thing and use it. So malicious behavior, prompt injection, jailbreaking, all of those things become a real big concern.”
He said Tomoro has a lot of experience with guardrails and has a robust multi-stage system for making these kinds of solutions safe, so there wasn’t any need to do anything particularly novel in terms of safety.
“The interesting things from that perspective, I think, are actually conversational design and how easy it can be to break and absolutely destroy the illusion of intelligence in these systems,” he said.
As an example, he pointed to the Concierge’s off-topic guardrails, designed to keep conversations relevant to travel. Unless you’ve been careful about how you design the conversational space by providing context and conversation history, he said, natural interactions like people saying “thank you,” or “no” in response to a question, may confuse the system.
“In the north of England, people use the word ‘ta’ to mean thank you,” Netherwood said. “We did have a couple of occasions where somebody actually said ‘ta’ to the agent and it thought they were speaking Swedish. So it replied in Swedish, ‘hey, uh sorry, only English language for now.’”
Netherwood said his team did a huge amount of work trying to balance safety with making sure that the agent can handle natural conversation that changes between different intents.
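As a rough illustration of why conversation history matters to an off-topic guardrail, the sketch below treats a bare acknowledgement such as “ta” or “no” as on-topic only when it follows an earlier turn. The word lists and the `is_on_topic` function are hypothetical and greatly simplified; a production guardrail would be far more involved and is not described in detail by Tomoro.

```python
# Hypothetical, deliberately tiny vocabularies for illustration only.
TRAVEL_WORDS = {"flight", "flights", "holiday", "hotel", "baggage", "booking", "fly"}
SHORT_REPLIES = {"yes", "no", "ok", "thanks", "thank you", "ta", "cheers"}

def normalise(text: str) -> str:
    """Lowercase the message and trim surrounding spaces and punctuation."""
    return text.lower().strip(" .!?,")

def is_on_topic(message: str, history: list[str]) -> bool:
    """Decide whether a message belongs in a travel conversation.

    Short replies carry no topic of their own, so they are accepted only
    when there is an earlier turn for them to respond to.
    """
    text = normalise(message)
    if text in SHORT_REPLIES:
        return len(history) > 0
    return any(word in TRAVEL_WORDS for word in text.split())
```

Judged in isolation, “ta” has no travel content and would be rejected; judged with the preceding turn available, it is an ordinary reply.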
As to the expense of operating the bot, Netherwood insisted that the API calls and backend infrastructure are not that costly. “In the future we will be exploring reasoning models, particularly in itinerary building when those itineraries get complex and where we can afford latency in the conversation,” he said. “But right now [the cost] is pretty trivial.”
Reasoning models typically cost more to run than their non-reasoning relatives because they process more tokens over time. But that may be necessary to help Concierge and other agents consider the requirements of complex itineraries, like distances between locations and travel times.
Netherwood said that you have to be very careful when building a complex itinerary to ensure that the agent has enough situational awareness. Otherwise, he posited, you might get a situation where you ask the agent to book a restaurant reservation and it requires a 75-minute cab ride, despite the fact that you’re already booked to visit a museum 30 minutes before the reservation time.
“The models are capable of [itinerary planning],” he said. “I think they’re going to be really capable of that now that we have more effective reasoning models.”
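The scheduling conflict Netherwood describes comes down to a simple feasibility check that an itinerary builder could run between consecutive bookings. The sketch below is an assumption-laden illustration: the `Event` type, the `can_reach` helper, and the example times are hypothetical, and it assumes the museum visit ends 30 minutes before the reservation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Event:
    name: str
    start: datetime
    end: datetime

def can_reach(previous: Event, following: Event, travel: timedelta) -> bool:
    """True if leaving when `previous` ends still allows arrival by the start of `following`."""
    return previous.end + travel <= following.start

# Hypothetical times: the museum visit ends 30 minutes before the dinner reservation.
museum = Event("museum", datetime(2026, 3, 14, 17, 30), datetime(2026, 3, 14, 18, 30))
dinner = Event("dinner", datetime(2026, 3, 14, 19, 0), datetime(2026, 3, 14, 21, 0))

# A 75-minute cab ride does not fit in the 30-minute gap; a 20-minute one does.
feasible_long_ride = can_reach(museum, dinner, timedelta(minutes=75))
feasible_short_ride = can_reach(museum, dinner, timedelta(minutes=20))
```

An agent with this kind of check available as a tool, or a reasoning model that works through the same arithmetic, would be able to flag the restaurant booking before making it.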
So far, Netherwood said, the launch of Concierge has gone well. “We haven’t had jailbreaks,” he said. “We haven’t had all of the nightmare things that you would not want to happen.”
Educating customers: This is not your usual bot
One of the next steps beyond improving the AI agent itself involves educating people on how to communicate in natural language with agents.
Consumers, Netherwood said, are used to interacting with chatbots that follow a programmatic tree-based structure – banking support bots that predated generative AI, for example. So they often fail to engage with agents in a way that takes advantage of their capabilities.
“So for example, you can have a really nice conversation with the agent,” he explained. “You can say, ‘I want to go to New York in March next year, long weekend for our anniversary. It would be fantastic if we could stay somewhere like 20 minutes walking distance from Central Park, but not too expensive in terms of the hotel. We’d like to spend the money on maybe flying upper class,’ and [the agent] will go find you that experience. But of course that’s not necessarily intuitive when people are used to going into these systems and saying, ‘holiday now’ or ‘speak to agent now.’”
(Sometimes, people just want to speak to another human.)
Netherwood said his team has been exploring ways to help people communicate more effectively with AI models and he believes that real-time speech eventually will prove really helpful for these sorts of interactions.
“We just have to be in the place where we can make sure that the tool-calling and instruction-following accuracy is there,” he said, “because the last thing you want is a lovely, natural-feeling conversation with an agent and then 50 percent of the time it can’t go find a flight or there’s an error.” ®