Trio Data Engine
Taking XML data from multiple sources, and preparing it for downstream analysis by multiple BI applications including the popular Elastic and Kibana
Online travel companies generate floods of data, and turning this data into insights that deliver value through business intelligence (BI) is imperative. Evidence suggests that many of the challenges and costs involved in building an effective BI system lie not in the analysis itself, but in the complexity of developing the means to capture, process and surface the required data in a suitable shape and format.
This challenge is at its greatest with structured data such as the XML and JSON messages exchanged through APIs. These transactions can contain hundreds of elements, often in complex list forms. By its transactional nature, API data needs to be processed in real time to be actionable and deliver value. To thrive, travel businesses need to put in place the robust data foundations that will enable their BI capability to build the prescriptive analytics and AI systems of tomorrow.
The Business Need
Most enterprises have systems that generate massive amounts of data, often created and stored on different systems and in different formats, which can make analysing it effectively very difficult. Think of the transaction data produced by Central Reservation Systems (CRS) or Passenger Service Systems (PSS), much of it in XML or JSON format. The sheer quantity, scope and complexity of this data challenges many well-known data ingestion tools that have been designed with the more usual log files in mind. Raw XML/JSON data often needs careful sanitising, merging and processing to make it ‘fit’ for analysis and reporting. This is where the Trio Engine comes in: the perfect data processing hub for XML/JSON formatted data.
Business Intelligence in travel is all about going deep into the data. To do this, many organisations are using BI technologies such as the open source Elasticsearch, Logstash, Kibana (ELK) stack, among others, but are facing challenges in getting their data into the right state in a timely manner. We can help you address these limitations.
Trio Data Engine - What it is
With our Trio Data Engine, we are combining our scalable platform and data enrichment expertise to help travel providers get the most out of their data using these functionally rich BI environments. We have used our extensive experience of analysing XML/JSON message streams running through online travel provider APIs to evolve our platform into a versatile and highly scalable engine that can collect raw XML/JSON data from a variety of sources and transform it into ‘fit for purpose’ data to feed a variety of data consuming systems.
Trio Data Engine acts as a processing hub for a variety of functions such as cleansing, extraction, translation, aggregation, retention and analysis. The prepared data can then be fed into a variety of other BI systems for further analysis and consumer presentation to extract insights for business. Alternatively, the Trio Engine’s own analytics and presentation layer can be used for this purpose.
The consuming systems can be either:
- The BI applications preferred by an organisation’s business analysts
- Other automated systems that depend on prepared data feeds as part of their analytic processing
The Trio Data Engine excels in high-volume, real-time XML/JSON data collection and preparation, providing the clean source data that your BI or AI engine needs. Only clean and timely source data will give you visibility into your business that you can trust.
Trio Data Engine - How it works
The diagram shows the broad flow from data collection, through the value-added processes, to the downstream BI systems. The systems shown are not exhaustive, but representative.
Want to know more?
Who we are and what we do
15 years' experience in monitoring critical high-performance systems for blue chips. Since 2012 we have been perfecting the Trio platform's capabilities of collecting, cleansing, preparing, storing and analysing XML/JSON data in near-real time for B2B travel search and booking traffic. SaaS or on-premise versions meet the demands of speed, accuracy and scalability for major distributors such as Hotelbeds, Farelogix and Bonotel.
Our systems process more than 3 billion transactions per day.
In travel, search requests and replies include raw XML data flowing between an organisation and its clients through the organisation's API(s), which are in turn connected to its central reservation system or booking engine. The Trio Engine is able to capture that raw data (unobtrusively and without impact on servers or networks) from a variety of cloud or on-premise sources, including networks, cloud-based object stores or big data message queues.
Data cleansing is the first step in the overall data preparation process and involves identifying and correcting errors and anomalies in messy, raw data. Data can come from a number of disparate sources, all of which need to be cleaned into a consistent, unified format so that everything adheres to the same shape and schema and is easier to consume.
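As a minimal sketch of the kind of cleansing described above (an illustration only, not the Trio Engine's implementation), the Python below maps raw records onto one unified schema and drops anomalies. The field names, date formats and the rule that a negative price is an anomaly are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical date formats seen across sources (assumption for illustration).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def parse_date(value):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(str(value).strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def cleanse(record):
    """Map one raw record onto the unified schema; return None for anomalies."""
    check_in = parse_date(record.get("check_in", ""))
    try:
        # Remove thousands separators before converting to a number.
        price = float(str(record.get("price", "")).replace(",", ""))
    except ValueError:
        return None
    if check_in is None or price < 0:
        return None
    return {
        "hotel_id": str(record.get("hotel_id", "")).strip().upper(),
        "check_in": check_in,
        "currency": str(record.get("currency", "")).strip().upper(),
        "price": round(price, 2),
    }


# Three made-up records: one untidy but valid, two anomalous.
raw = [
    {"hotel_id": " h123 ", "check_in": "05/03/2024", "currency": "eur", "price": "1,250.50"},
    {"hotel_id": "H456", "check_in": "2024-03-06", "currency": "USD", "price": "-10"},
    {"hotel_id": "H789", "check_in": "not a date", "currency": "GBP", "price": "99"},
]
clean = [c for c in (cleanse(r) for r in raw) if c is not None]
```

In practice, rejected records would more likely be routed to a review queue than silently dropped; the sketch discards them only to stay short.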
- Extraction and Translation
Using business rules or lookup tables, the data is extracted and blended from different data sources into a homogeneous format. The data is typically transactional in nature and needs to be merged with the more static data that defines products, clients and other business entities. Triometric understands travel data and, working with your team, we can define the optimised and meaningful metadata descriptions and data relationships that will give you the right data for further downstream analysis. At this stage, data can be enriched as required to produce additional data sets that simplify your downstream processes.
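To illustrate the idea, the sketch below extracts flat rows from a nested XML message and blends them with static reference data from a lookup table. The message layout, element names and the `HOTELS` table are hypothetical, invented for this example rather than taken from any real travel API or from the Trio platform.

```python
import xml.etree.ElementTree as ET

# Hypothetical availability response (assumption for illustration).
RESPONSE_XML = """
<AvailabilityResponse client="C42">
  <Hotel code="H123"><Rate currency="EUR">120.00</Rate></Hotel>
  <Hotel code="H999"><Rate currency="EUR">80.00</Rate></Hotel>
</AvailabilityResponse>
"""

# Hypothetical static product data, keyed by hotel code.
HOTELS = {"H123": {"name": "Example Plaza", "destination": "PMI"}}


def extract_rows(xml_text):
    """Flatten the list of Hotel elements into one dict per hotel."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for hotel in root.findall("Hotel"):
        rate = hotel.find("Rate")
        rows.append({
            "client": root.get("client"),
            "hotel_code": hotel.get("code"),
            "currency": rate.get("currency"),
            "price": float(rate.text),
        })
    return rows


def enrich(row, lookup):
    """Merge a transactional row with static data; mark unknown codes."""
    static = lookup.get(row["hotel_code"], {})
    return {
        **row,
        "hotel_name": static.get("name", "UNKNOWN"),
        "destination": static.get("destination", "UNKNOWN"),
    }


rows = [enrich(r, HOTELS) for r in extract_rows(RESPONSE_XML)]
```

Marking unmatched codes as `UNKNOWN` rather than dropping the row is one possible design choice; it keeps the transaction count intact while making gaps in the reference data visible downstream.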
Data aggregation is any process in which information is expressed in a summary form. Ineffective data aggregation can limit query performance. BI reporting tools tend to be weaker at handling large volumes of raw transactions, so most reporting is based on aggregated information. A quick dive into the data will show why getting aggregation and data enrichment right can generate significant performance benefits, with inherent improvements in analysis and reporting capabilities.
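A small sketch of what such a summary might look like: raw transactions are rolled up per client and destination into counts and an average response time, so that a reporting tool queries a handful of summary rows instead of every transaction. The field names and the choice of grouping key are assumptions made for the example.

```python
from collections import defaultdict

# Made-up raw transactions (assumption for illustration).
transactions = [
    {"client": "C42", "destination": "PMI", "response_ms": 180, "error": False},
    {"client": "C42", "destination": "PMI", "response_ms": 220, "error": False},
    {"client": "C42", "destination": "LON", "response_ms": 900, "error": True},
    {"client": "C7", "destination": "PMI", "response_ms": 150, "error": False},
]


def aggregate(rows):
    """Summarise transactions per (client, destination) pair."""
    totals = defaultdict(lambda: {"searches": 0, "errors": 0, "total_ms": 0})
    for row in rows:
        bucket = totals[(row["client"], row["destination"])]
        bucket["searches"] += 1
        bucket["errors"] += int(row["error"])
        bucket["total_ms"] += row["response_ms"]
    return {
        key: {
            "searches": b["searches"],
            "errors": b["errors"],
            "avg_response_ms": b["total_ms"] / b["searches"],
        }
        for key, b in totals.items()
    }


summary = aggregate(transactions)
```

Here four raw transactions collapse into three summary rows; at production volumes the reduction is what makes downstream queries manageable.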
Whilst the emphasis is on feeding data to decision-making systems, Trio Data Engine has the inherent capability to manage the retention of the data on a shorter or longer term basis. For example, if the output system is focused on dynamic pricing, its ability to store large-scale search data (which is predictive of demand) will likely be limited, but it might benefit from trend-type metrics that require data storage, e.g. monthly averages or gradients.
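As a rough sketch of such trend-type metrics, the example below derives monthly averages from retained daily search counts and then a gradient, computed here as the least-squares slope of those averages. The sample figures are invented, and the least-squares slope is only one reasonable reading of "gradient".

```python
from collections import defaultdict
from statistics import mean

# Made-up daily search counts as (ISO date, count) pairs.
daily_searches = [
    ("2024-01-10", 100), ("2024-01-20", 120),
    ("2024-02-10", 130), ("2024-02-20", 150),
    ("2024-03-10", 160), ("2024-03-20", 180),
]


def monthly_averages(samples):
    """Average the counts per calendar month (keyed by 'YYYY-MM')."""
    by_month = defaultdict(list)
    for day, count in samples:
        by_month[day[:7]].append(count)
    return {month: mean(values) for month, values in sorted(by_month.items())}


def gradient(values):
    """Least-squares slope of values against their index (needs 2+ values)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


averages = monthly_averages(daily_searches)
trend = gradient(list(averages.values()))  # change in searches per month
```

Only the small `averages` mapping and the single `trend` figure would need to reach the pricing system; the bulky daily data stays in the retention layer.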
BI isn’t generic. So your BI initiative should be supported by the right toolset that’s tailored to fit your data.