Digital underwriting ml #203

Open
wants to merge 17 commits into
base: solutions-poc
171 changes: 171 additions & 0 deletions source/includes/images/industry-solutions/Writing Fig1.svg
22 changes: 17 additions & 5 deletions source/solutions-library.txt
Original file line number Diff line number Diff line change
@@ -112,7 +112,23 @@ kick-start their projects.

.. tab:: Media
:tabid: media

App-Driven Analytics
--------------------
.. card-group::
:columns: 2
:style: extra-compact

.. card::
:headline: Automating digital underwriting with Machine Learning
:url: https://deploy-preview-202--docs-atlas-architecture.netlify.app/solutions-library/Streamline-global-gaming-management/
:icon: mdb_custom_aggregation
:icon-alt: Atlas mdb_custom_aggregation icon

Leverage Machine Learning with real-time data processing and
automate digital underwriting.


Gen AI
------

@@ -130,11 +146,7 @@ kick-start their projects.
user experiences by using Atlas Database,
Atlas Vector Search, and Pymongo.



.. App-Driven Analytics
.. --------------------


.. .. tab:: Retail
.. :tabid: retail

2 changes: 1 addition & 1 deletion source/solutions-library/audio-based-AI-diagnostics.txt
Original file line number Diff line number Diff line change
@@ -83,7 +83,7 @@ maintenance needs, minimize disruptions, and improve fleet management.
equipment to identify potential issues early, ensuring patient safety, and
optimizing healthcare operations.

.. video:: https://youtu.be/-FCyBZlA2_8
.. video:: https://youtu.be/91WlXYEUEkk

A fun and quick demo of MongoDB's Vector Search capabilities for anomaly
detection through sound input, using a basic handheld fan as our makeshift wind
230 changes: 230 additions & 0 deletions source/solutions-library/digital-underwriting.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,230 @@
.. _arch-center-is-digital-underwriting-machinelearning-solution:

=====================================================
Automating digital underwriting with Machine Learning
=====================================================

.. facet::
:name: genre
:values: tutorial

.. contents:: On this page
:local:
:backlinks: none
:depth: 1
:class: singlecol

Leverage Machine Learning with real-time data processing and automate
digital underwriting.

**Use cases:** `Gen AI <https://www.mongodb.com/use-cases/artificial-intelligence>`__,
`Analytics <https://www.mongodb.com/use-cases/analytics>`__

**Industries:** `Insurance <https://www.mongodb.com/industries/insurance>`__,
`Financial Services <https://www.mongodb.com/industries/financial-services>`__,
`Healthcare <https://www.mongodb.com/industries/healthcare>`__

**Products and tools:** `Time Series <https://www.mongodb.com/time-series>`__,
`Atlas App Services <https://www.mongodb.com/atlas/app-services>`__,
`Atlas Triggers <https://www.mongodb.com/docs/atlas/app-services/triggers/>`__,
`Atlas Functions <https://www.mongodb.com/docs/atlas/app-services/functions/>`__,
`Atlas Charts <https://www.mongodb.com/products/charts>`__,
`Spark Connector <https://www.mongodb.com/docs/spark-connector/current/>`__

**Partners:** `Databricks <https://www.mongodb.com/partners/databricks>`__

Solution Overview
-----------------

Imagine being able to offer your customers personalized, usage-based
premiums that take into account their driving habits and behavior. To do
this, you'll need to gather data from connected vehicles, send it to a
machine learning platform for analysis, and then use the results to
create a personalized premium for your customers. You’ll also want to
visualize the data to identify trends and gain insights. This unique,
tailored approach will give your customers greater control over their
insurance costs while helping you to provide more accurate and fair
pricing.

In the GitHub repo, you will find detailed, step-by-step instructions on
how to build the data upload and transformation pipeline leveraging
MongoDB Atlas platform features, as well as how to generate, send, and
process events to and from Databricks.

**By the end of this demo, you’ll have created a data visualization with
Atlas Charts that tracks changes in automated insurance premiums in
near real time:**

.. video:: https://www.youtube.com/watch?v=91WlXYEUEkk


Other Applicable Industries and Use Cases
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Financial Services**: Banks and financial institutions must be able to
make sense of time-stamped financial transactions for trading, fraud
detection, and more.

**Retail**: Retailers need real-time insights into what is happening
right now.

**Healthcare**: From the modes of transportation to the packages
themselves, IoT sensors enable supply chain optimization while
in transit and on-site.

Reference Architecture
----------------------

.. figure:: /includes/images/industry-solutions/Writing Fig1.svg
:figwidth: 1200px
:alt: An illustration shows a reference architecture

Figure 1: Reference Architecture With MongoDB

Data Model Approach
-------------------

A basic example data model to support this use case would include
customers, the trips they take, the policies they purchase, and the
vehicles insured by those policies.

This example builds out three MongoDB collections, as well as two
materialized views. The full Hackolade data model, which defines all the
MongoDB objects within this example, can be found on GitHub.

.. figure:: /includes/images/industry-solutions/deployment-hierarchy.svg
:figwidth: 1200px
:alt: An illustration shows the MongoDB Data model approach

Figure 2: MongoDB Data model approach
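
As a rough illustration of the shapes involved, the sketch below shows one hypothetical document for three of the collections named on this page, written as Python dictionaries. Only the collection names and the ``monthlyPremium`` array come from this page; every field name and value is an assumption made for illustration, and the Hackolade model on GitHub remains the authoritative definition.

```python
from datetime import datetime, timezone

# Hypothetical raw telemetry event in customerTripRaw (field names are assumptions).
raw_trip = {
    "customerId": "C-1001",
    "vehicleId": "V-2001",
    "timestamp": datetime(2024, 3, 14, 8, 30, tzinfo=timezone.utc),
    "distance": 12.4,
}

# Hypothetical daily summary in the customerTripDaily materialized view.
daily_summary = {
    "_id": {"customerId": "C-1001", "day": "2024-03-14"},
    "totalDistance": 37.9,
    "tripCount": 3,
}

# Hypothetical policy in customerPolicy; monthlyPremium gains one entry per month.
policy = {
    "customerId": "C-1001",
    "vehicleId": "V-2001",
    "baselinePremium": 80.0,
    "monthlyPremium": [
        {"month": "2024-02", "totalDistance": 812.0, "premium": 76.5},
    ],
}
```

Keeping the premium history as an array inside the policy document means a chart or application can read a customer's full premium timeline with a single query.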

Building the Solution
---------------------

A dataset that includes the total distance driven in car journeys is
loaded into MongoDB. A daily cron job runs at midnight, summarizes that
day's trips, and stores the summary as a document in a new collection
called ‘customerTripDaily’. A monthly cron job runs on the 25th day of
each month, aggregates the daily documents, and stores the result in a
new collection called ‘customerTripMonthly’. Every time a new monthly
summary is created, an Atlas Function posts the total distance for the
month and the baseline premium to Databricks for ML prediction. The ML
prediction is then sent back to MongoDB and added to the
‘customerTripMonthly’ document. As a final step, you can visualize all
of your data with Atlas Charts.
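
The two schedules described above can be written as standard five-field cron expressions. The sketch below is only an illustration: the midnight run time for the monthly job is an assumption (this page gives only the day of the month), and ``cron_matches`` is a hypothetical helper that handles plain numbers and ``*`` only, not the full cron syntax.

```python
from datetime import datetime

# Five-field cron expressions: minute hour day-of-month month day-of-week.
DAILY_SCHEDULE = "0 0 * * *"     # every day at 00:00
MONTHLY_SCHEDULE = "0 0 25 * *"  # 25th of each month; the 00:00 time is an assumption


def cron_matches(expression: str, moment: datetime) -> bool:
    """Check a simplified cron expression (numbers and '*' only) against a datetime."""
    fields = expression.split()
    # Cron counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    actual = (moment.minute, moment.hour, moment.day, moment.month, cron_weekday)
    return all(f == "*" or int(f) == value for f, value in zip(fields, actual))
```

For example, ``cron_matches(MONTHLY_SCHEDULE, datetime(2024, 3, 25, 0, 0))`` evaluates to ``True``, while the same call for the 24th evaluates to ``False``.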

.. procedure::
:style: normal

.. step:: Creating a data processing pipeline with a materialized view

The data processing pipeline component of this example consists of
sample data, a daily materialized view, and a monthly materialized
view. A sample dataset of IoT vehicle telemetry data represents
the motor vehicle trips taken by customers. It’s loaded into the
collection named ‘customerTripRaw’ (1). The dataset can be found on
GitHub and can be loaded with mongoimport or other methods. To create a
materialized view, a scheduled trigger executes a function that runs an
aggregation pipeline. The pipeline generates a daily summary of the raw
IoT data and places it in a materialized view collection named
‘customerTripDaily’ (2). Similarly, for the monthly materialized view, a
scheduled trigger executes a function that runs an aggregation pipeline,
which summarizes the information in the ‘customerTripDaily’ collection
on a monthly basis and places it in a materialized view collection named
‘customerTripMonthly’ (3).
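
A minimal sketch of what the daily summarization pipeline might look like is shown below, expressed as a Python list of aggregation stages. It is not the pipeline from the GitHub repo: the ``timestamp``, ``customerId``, and ``distance`` field names are assumptions, and ``build_daily_pipeline`` is a hypothetical helper. The ``$merge`` stage is what turns the aggregation output into an on-demand materialized view.

```python
from datetime import datetime, timedelta, timezone


def build_daily_pipeline(day_start: datetime) -> list:
    """Build an aggregation pipeline that summarizes one day of raw trips."""
    day_end = day_start + timedelta(days=1)
    return [
        # Keep only the raw events recorded during the given day.
        {"$match": {"timestamp": {"$gte": day_start, "$lt": day_end}}},
        # One summary per customer: total distance and number of raw events.
        {"$group": {
            "_id": {"customerId": "$customerId", "day": day_start},
            "totalDistance": {"$sum": "$distance"},
            "tripCount": {"$sum": 1},
        }},
        # Upsert the summaries into the materialized view collection.
        {"$merge": {
            "into": "customerTripDaily",
            "on": "_id",
            "whenMatched": "replace",
            "whenNotMatched": "insert",
        }},
    ]


pipeline = build_daily_pipeline(datetime(2024, 3, 14, tzinfo=timezone.utc))
```

With PyMongo, the pipeline could be run as ``db["customerTripRaw"].aggregate(pipeline)``. A monthly rollup would follow the same pattern, reading from ‘customerTripDaily’ and merging into ‘customerTripMonthly’.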

See the following steps in the GitHub repo to create the data processing pipeline:

- Step 1: `Load the sample data <https://github.com/mongodb-industry-solutions/Digital-Underwriting-Usage-Based-Insurance/blob/main/src/LoadingtheSampleData.md>`__
- Step 2: `Set up a daily cron job <https://github.com/mongodb-industry-solutions/Digital-Underwriting-Usage-Based-Insurance/blob/main/src/DailyCronJob.md>`__
- Step 3: `Set up a monthly cron job <https://github.com/mongodb-industry-solutions/Digital-Underwriting-Usage-Based-Insurance/blob/main/src/MonthlyCronJob.md>`__

.. figure:: /includes/images/industry-solutions/deployment-hierarchy.svg
:figwidth: 1200px
:alt: An illustration showing how to create a data processing pipeline

Figure 3: Creating a data processing pipeline

.. step:: Automating insurance premium calculations with a machine learning model

The decision-processing component of this example consists of a
scheduled trigger that collects the necessary data and posts the
payload to a Databricks MLflow API endpoint. (The model was
previously trained using the MongoDB Spark Connector on
Databricks.) It then waits for the model to respond with a
calculated premium based on the miles driven by a given customer
in a month. Then the scheduled trigger updates the
‘customerPolicy’ collection to append a new monthly premium
calculation as a new subdocument within the ‘monthlyPremium’
array.
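
The request and the write-back described above can be sketched as two small helpers. This is an illustration under stated assumptions, not the code from the GitHub repo: the column names and subdocument fields are hypothetical, and the payload uses MLflow's ``dataframe_split`` JSON format, which may differ from what the deployed model expects.

```python
import json


def build_scoring_payload(total_distance: float, baseline_premium: float) -> str:
    """Serialize one row of model inputs in MLflow's dataframe_split JSON format."""
    return json.dumps({
        "dataframe_split": {
            # Column names are assumptions; they must match the trained model's inputs.
            "columns": ["totalDistance", "baselinePremium"],
            "data": [[total_distance, baseline_premium]],
        }
    })


def build_premium_update(month: str, total_distance: float, premium: float) -> dict:
    """Build the update that appends one subdocument to the monthlyPremium array."""
    return {
        "$push": {
            "monthlyPremium": {
                "month": month,
                "totalDistance": total_distance,
                "premium": premium,
            }
        }
    }


payload = build_scoring_payload(812.0, 80.0)
update = build_premium_update("2024-03", 812.0, 76.5)
```

The payload would be sent in an authenticated HTTP POST to the model endpoint, and the update could then be applied with PyMongo's ``update_one`` against the ‘customerPolicy’ collection, using a filter that identifies the customer's policy.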

See the following steps in the GitHub repo to create the decision-processing component:

- Step 4: `Set up a calculate premium trigger <https://github.com/mongodb-industry-solutions/Digital-Underwriting-Usage-Based-Insurance/blob/main/src/CalculatePremiumTrigger.md>`__
- Step 5: `Set up the Databricks connection <https://github.com/mongodb-industry-solutions/Digital-Underwriting-Usage-Based-Insurance/blob/main/src/DatabricksConfiguration.md>`__
- Step 6: `Write the machine learning model prediction to MongoDB <https://github.com/mongodb-industry-solutions/Digital-Underwriting-Usage-Based-Insurance/blob/main/src/Prediction.md>`__

.. figure:: /includes/images/industry-solutions/deployment-hierarchy.svg
:figwidth: 1200px
:alt: Automating Calculations with Machine Learning Model

Figure 4: Automating Calculations with Machine Learning Model




.. step:: Near-real-time insights of insurance premium changes over time

Once the monthly premium calculations have been appended, it’s
easy to set up Atlas Charts to visualize your newly calculated
usage-based premiums. Configure different charts to see how
premiums have changed over time to discover patterns.

Technologies and Products Used
------------------------------

MongoDB modern, multi-cloud database platform:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- `Atlas Database <https://www.mongodb.com/atlas/database>`__
- `Aggregation Pipelines <https://www.mongodb.com/docs/v7.0/core/aggregation-pipeline/>`__
- `Materialized Views <https://www.mongodb.com/docs/v5.0/core/materialized-views/>`__
- `Time Series <https://www.mongodb.com/time-series>`__
- `MongoDB Spark Connector <https://www.mongodb.com/products/spark-connector>`__
- `Atlas Charts <https://www.mongodb.com/products/charts>`__
- `Atlas App Services <https://www.mongodb.com/atlas/app-services>`__
- `Triggers <https://www.mongodb.com/docs/atlas/app-services/triggers/>`__
- `Functions <https://www.mongodb.com/docs/atlas/app-services/functions/>`__


Partner technologies:
~~~~~~~~~~~~~~~~~~~~~

- `Databricks <https://www.mongodb.com/partners/databricks>`__

Key Considerations
------------------

- Building materialized views on time series data: refer to steps 1-3 in
the GitHub repo.
- Scheduling aggregation pipelines with cron expressions: refer to steps
2 and 3 in the GitHub repo.
- Serving machine learning models with MongoDB Atlas data: refer to step 4
in the GitHub repo.
- Writing a machine learning model prediction to an Atlas database:
refer to step 6 in the GitHub repo.
- Visualizing near-real-time insights of continuously changing model
results: refer to the Bonus step in the GitHub repo.


Authors
-------

- Jeff Needham, MongoDB
- Ainhoa Múgica, MongoDB
- Luca Napoli, MongoDB
- Karolina Ruiz Rogelj, MongoDB





4 changes: 3 additions & 1 deletion source/solutions-library/insurance-app-driven-analytics.txt
Original file line number Diff line number Diff line change
@@ -1,2 +1,4 @@
.. toctree::
:titlesonly:
:titlesonly:

Digital Underwriting </solutions-library/digital-underwriting>
3 changes: 2 additions & 1 deletion source/solutions-library/insurance-gen-ai.txt
Original file line number Diff line number Diff line change
@@ -2,4 +2,5 @@
:titlesonly:

PDF Search </solutions-library/pdf-search>
Claim Management <solutions-library/claim-management>
Claim Management <solutions-library/claim-management>
Digital Underwriting </solutions-library/digital-underwriting>
3 changes: 2 additions & 1 deletion source/solutions-library/manufacturing-iot.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
.. toctree::
:titlesonly:

Audio-Based AI Diagnostics <solutions-library/audio-based-AI-diagnostics>
Audio-Based AI Diagnostics <solutions-library/audio-based-AI-diagnostics>
Digital Underwriting </solutions-library/digital-underwriting>