Context & questions

Our client wanted to build a data platform to deliver on multiple use cases such as predictive maintenance, customization and price optimization. They faced many architecture challenges and had many questions:

  • How can we make the best use of the different data sources we have, including the cloud?
  • Should we centralize data in a structured data warehouse?
  • Should we add an additional layer on top?
  • What level of cleaning and transformation should be performed?
  • What level of involvement and autonomy should the business have in these tasks?

Our adviser

For this consultation, we picked Jean Michel C., Data Expert at Singapore Airlines.

Our answer

During the consultation, Jean Michel made the following recommendations:

1. Make a clear distinction between time-sensitive data (Kafka, Event Hub, Serverless Data Hub) and non-time-sensitive data (batch, data dumps on cloud object storage)

2. Do not store data in a centralized data warehouse, for the following reasons:

  • It requires near-permanent compute and a single upfront design before it can be used
  • Complex data pipelines must be built to feed a rigid schema
  • It is difficult to implement across business domains

3. Adopt a Lake House approach (technical approach):

  • A data lake with query capabilities
  • A limited data warehouse that supports only “warm” data for reporting purposes
  • All “cold” data stays in the data lake and can be queried directly, with no data warehouse required
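
The two distinctions above (time-sensitive versus batch ingestion, and “warm” versus “cold” serving) can be written down as a small planning rule. The following is only a minimal sketch under stated assumptions: the class names, the two boolean flags and the example dataset names are hypothetical illustrations, not part of the consultation or of any specific product.

```python
from dataclasses import dataclass
from enum import Enum


class Ingestion(Enum):
    STREAM = "stream"  # time-sensitive path, e.g. Kafka or Event Hub
    BATCH = "batch"    # non-time-sensitive path, e.g. dumps to cloud object storage


class Serving(Enum):
    WAREHOUSE = "warehouse"  # "warm" data exposed for reporting
    LAKE = "lake"            # "cold" data queried directly in the data lake


@dataclass(frozen=True)
class Dataset:
    # All three attributes are hypothetical; a real platform would
    # derive them from the use cases agreed with the business.
    name: str
    time_sensitive: bool
    used_for_reporting: bool


def plan(dataset: Dataset) -> tuple[Ingestion, Serving]:
    """Return the ingestion path and serving layer for one dataset."""
    ingestion = Ingestion.STREAM if dataset.time_sensitive else Ingestion.BATCH
    serving = Serving.WAREHOUSE if dataset.used_for_reporting else Serving.LAKE
    return ingestion, serving


if __name__ == "__main__":
    # Hypothetical datasets, loosely named after the use cases in this case study.
    for ds in (
        Dataset("machine_telemetry", time_sensitive=True, used_for_reporting=False),
        Dataset("price_history", time_sensitive=False, used_for_reporting=True),
    ):
        ingestion, serving = plan(ds)
        print(f"{ds.name}: ingest via {ingestion.value}, serve from {serving.value}")
```

Keeping the two decisions independent reflects the recommendation that only reporting data needs a warehouse; everything else can stay in the lake regardless of how it arrived.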

Jean Michel also advised avoiding the following pitfalls:

1. Collect all the available data … just in case

  • Expensive
  • Not useful: it requires downstream management with no oversight from a data owner, since the use cases have not been defined

2. Let IT decide data quality

  • Missing fields can be acceptable from a business perspective
  • Data quality is the first thing to check before making data available

3. Let IT drive a data project without a data owner from the business

  • Each data project or stream needs a business role
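
One way to make these pitfalls hard to fall into is to refuse to register a data stream that has no business owner or use case, and to let that owner declare which fields are actually required. This is a minimal sketch under those assumptions; `DataStream`, `missing_required` and every field name below are hypothetical and not taken from the consultation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataStream:
    # Hypothetical registry entry for one data project or stream.
    name: str
    business_owner: str  # accountable role on the business side, not IT
    use_case: str        # why the data is collected at all
    # Fields the business owner considers mandatory; anything else may be missing.
    required_fields: frozenset[str] = field(default_factory=frozenset)

    def __post_init__(self) -> None:
        # Reject streams collected "just in case" or driven by IT alone.
        if not self.business_owner.strip():
            raise ValueError(f"stream {self.name!r} has no business owner")
        if not self.use_case.strip():
            raise ValueError(f"stream {self.name!r} has no defined use case")


def missing_required(stream: DataStream, record: dict) -> list[str]:
    """List the business-required fields that are absent or None in a record."""
    return sorted(f for f in stream.required_fields if record.get(f) is None)


if __name__ == "__main__":
    telemetry = DataStream(
        name="machine_telemetry",
        business_owner="Maintenance manager",
        use_case="predictive maintenance",
        required_fields=frozenset({"machine_id", "timestamp"}),
    )
    record = {"machine_id": "M-001", "timestamp": None, "oil_temperature": None}
    # Only business-required fields are reported; optional gaps are tolerated.
    print(missing_required(telemetry, record))
```

The check runs before data is made available, and the list of required fields belongs to the business owner, so a missing optional field does not block a record.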

Our answer

Based on our recommendations, Poclain was able to write a Request for Proposal, saving weeks of work. They also understood which providers to work with as a priority. We helped them save time and money in building their data platform.

