Cold Path Analytics

Azure Cosmos DB – TTL (Time to Live) – Reference Usecase

October 9, 2018 .NET, .NET Core, .NET Framework, Analytics, Architecture, Azure, Azure, Azure Cosmos DB, Azure Functions, Azure IoT Suite, Cloud Computing, Cold Path Analytics, CosmosDB, Emerging Technologies, Hot Path Analytics, Intelligent Cloud, Intelligent Edge, IoT Edge, IoT Hub, Microsoft, Realtime Analytics, Visual Studio 2017, VisualStudio, VS2017, Windows No comments

TTL capability within Azure Cosmos DB is a live saver, as it would take necessary steps to purge redudent data based on the configurations you may. 

Let us think in terms of an Industrial IoT scenario, devices can produce vast amounts of telemetry information, logs and user session information that is only useful until we operate on them and take action on them, to be specific up to finate period of time. Once that data becomes surplus, we need an application logic that purges these old records.

With the β€œTime to Live” or TTL, Microsoft Cosmos DB provides an ability to have your documents automatically purged from database storage after a certian period if time(which you configured)

  • This TTL by default can be set on a document collection level and later can be overridden on a per document basis.
  • Once the TTL is set, Cosmos DB service will automatically remove the documents when its lifetime is over.
  • Inorder to track TTL, Cosmos DB uses an offset field to check when it was last modified.  This field is identifiable as β€œ_ts”, which exists in every document you create.  Basically it is a UNIX epoch timestamp. This field is updated everytime when the document is modified. [Ref: Picture1]

image

[Picture1]

Enabling TTL on Cosmos DB Collection:

You can enable TTL on a Cosmos DB collection simply by using Azure Portal –> Cosmos DB collection setting for existing or during creation of  a new collection)

TTL value needs to be set in seconds – if you need 90 days => 60 sec * 60 min * 24 hour * 90 days = 7776000 seconds

image

[Picture2]

Below is a one of the reference architecture in which Cosmos DB – TTL would be essentially useful and viable to any Iot business case:

image

[Picture3]

Hope that was helpful to get some understanding. For more references visit:  Cosmos DB Documentation

Introduction to Data Science

June 3, 2017 Analytics, Big Data, Big Data Analytics, Big Data Management, Cloud Computing, Cold Path Analytics, Data Analytics, Data Collection, Data Hubs, Data Science, Data Scientist, Edge Analytics, Emerging Technologies, Hot Path Analytics, Human Computer Interation, Hype vs. reality, Industrial Automation, Internet of Nano Things, Internet of Things, IoT, IoT Devices, Keyword Analysis, KnowledgeBase, Machine Learning(ML), machine-to-machine (M2M), Machines, Predictive Analytics, Predictive Maintenance, Realtime Analytics, Robotics, Sentiment Analytics, Stream Analytics No comments

We all have been hearing the term Data Science and Data Scientist occupation become more popular these days. I thought of sharing some light into this specific area of science, that may seem interesting for rightly skilled readers of my blog. 

Data Science is one of the hottest topics on the Computer and Internet  nowadays. People/Corporations have gathered data from applications and systems/devices until today and now is the time to analyze them. The world wide adoption of Internet of Things has also added more scope analyzing and operating on the huge data being accumulated from these devices near real-time.

As per the standard Wikipedia definition goes β€œData science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining.”.

Data Science requires the following skillset:

  • Hacking Skills
  • Mathematics and Statistical Knowledge
  • Substantive Scientific Expertise

aoz1BJy

[Image Source: From this article by Berkeley Science Review.]

Data Science Process:

Data Science process involves collecting row data, processing data, cleaning data, data analysis using models/algorithms and visualizes them for presentational approaches.  This process is explained through a visual diagram from Wikipedia.

Data_visualization_process_v1

[Data science process flowchart, source wikipedia]

Who are Data Scientist?

Data scientists use their data and analytical ability to find and interpret rich data sources; manage large amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure consistency of datasets; create visualizations to aid in understanding data; build mathematical models using the data; and present and communicate the data insights/findings.

They are often expected to produce answers in days rather than months, work by exploratory analysis and rapid iteration, and to produce and present results with dashboards (displays of current values) rather than papers/reports, as statisticians normally do.

Importance of Data Science and Data Scientist:

“This hot new field promises to revolutionize industries from business to government, health care to academia.”

β€” The New York Times

Data Scientist is the sexiest job in the 21st century as per Harward Business Review.

McKinsey & Company projecting a global excess demand of 1.5 million new data scientists.

What are the skills required for a Data Scientist, let me share you a visualization through a Brain dump.

FxsL3b8

I thought of sharing an image to take you through the essential skill requirements for a Modern Data Scientist.

So what are you waiting for?, if you are rightly skilled get yourselves an Data Science Course.

Informational  Sources: