Data science at scale using Apache Flink: Dynamic model serving and real-time feature generation

While developing machine learning models for fraud detection, delivery address autocompletion, user identity detection, and similar problems, we faced several challenges. One of the biggest was generating features and serving ML models dynamically, in real time and at scale.

To achieve this in real time and at high scale, we developed our data intelligence platform, "Mitra". It is based on the Kappa+ architecture, in which we process all data as streams. Our core engine is built on Apache Flink, with Kafka as the data queue and RocksDB as the state backend. We use Kafka both for data flow and as a control stream that sends dynamic control signals to the platform. Mitra also includes many other components, such as a graph DB, an ML model server, a dynamic rule engine on streams, and an in-memory data lake.
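The idea of a control stream alongside the data stream can be sketched in plain Python, without Flink or Kafka. This is only an illustration under assumptions: the message shapes (`op`, `rule`, `threshold`, `amount`), the `RuleEngine` class, and the `run` helper are hypothetical and are not taken from Mitra. The dictionary of thresholds stands in for state that every worker would receive from the control stream.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class RuleEngine:
    # rule name -> amount threshold (hypothetical rule shape, not Mitra's schema)
    thresholds: Dict[str, float] = field(default_factory=dict)

    def on_control(self, signal: dict) -> None:
        """Apply a control signal: add or update a rule, or delete one."""
        if signal["op"] == "upsert":
            self.thresholds[signal["rule"]] = signal["threshold"]
        elif signal["op"] == "delete":
            self.thresholds.pop(signal["rule"], None)

    def on_event(self, event: dict) -> List[str]:
        """Return the names of the rules this event violates, sorted."""
        return sorted(
            name for name, limit in self.thresholds.items()
            if event["amount"] > limit
        )


def run(messages: Iterable[Tuple[str, dict]]) -> List[Tuple[int, List[str]]]:
    """Process interleaved ("control", payload) and ("data", payload) messages in arrival order."""
    engine = RuleEngine()
    results = []
    for kind, payload in messages:
        if kind == "control":
            engine.on_control(payload)
        else:
            results.append((payload["id"], engine.on_event(payload)))
    return results
```

In this sketch, the same event is flagged or not depending on which control signals arrived before it, which is the property that lets rules change without redeploying the job.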

Key features of the Mitra platform, which we developed using Apache Flink:
- Predict results within 200 milliseconds in a distributed environment
- Generate hundreds of features on the fly during model serving
- Serve results from deployed ML models
- Run a dynamic rule engine on Flink streams
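On-the-fly feature generation from per-key state might look roughly like the following plain-Python sketch. It assumes a sliding time window per key and in-order timestamps; the class name `WindowedFeatures` and the feature names (`txn_count`, `txn_sum`, `txn_avg`) are hypothetical, and the talk does not say which features Mitra actually computes.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class WindowedFeatures:
    """Keeps recent (timestamp, amount) pairs per key and derives features from them."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        # key -> recent events; stands in for keyed state in a stream processor
        self._state: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)

    def update(self, key: str, timestamp: float, amount: float) -> Dict[str, float]:
        events = self._state[key]
        events.append((timestamp, amount))
        # Evict events that have left the window (assumes in-order timestamps per key).
        cutoff = timestamp - self.window_seconds
        while events and events[0][0] <= cutoff:
            events.popleft()
        total = sum(a for _, a in events)
        return {
            "txn_count": len(events),
            "txn_sum": total,
            "txn_avg": total / len(events),
        }
```

Each incoming event both updates the state and returns the current feature values for its key, so the features are available at scoring time without a separate batch job.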

We rely heavily on Flink's in-memory state, CEP (complex event processing), broadcast state, and Async I/O to achieve this. Our Flink application has more than 60 operators and 40+ in-memory states.
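The reason asynchronous I/O matters for a 200 millisecond target can be illustrated with Python's `asyncio` rather than Flink's own API. This is a sketch under assumptions: the talk does not describe how the latency budget is enforced, and `score_with_budget`, `score_all`, the `call_model` parameter, and the fallback value are all hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def score_with_budget(
    call_model: Callable[[dict], Awaitable[Any]],
    features: dict,
    budget_seconds: float = 0.2,
    fallback: Any = None,
) -> Any:
    """Await an external model call, returning a fallback if it exceeds the budget."""
    try:
        return await asyncio.wait_for(call_model(features), timeout=budget_seconds)
    except asyncio.TimeoutError:
        return fallback


async def score_all(
    call_model: Callable[[dict], Awaitable[Any]],
    batch: List[dict],
    budget_seconds: float = 0.2,
    fallback: Any = None,
) -> List[Any]:
    """Issue all calls concurrently so one slow request does not block the others."""
    return await asyncio.gather(
        *(score_with_budget(call_model, f, budget_seconds, fallback) for f in batch)
    )
```

Calling `asyncio.run(score_all(some_async_model, batch))` returns results in the same order as `batch`, with the fallback value in place of any call that ran past the budget.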

For more information, read this blog post: https://medium.com/razorpay-unfiltered/data-science-at-scale-using-apache-flink-982cb18848b

Our platform and architecture have improved a lot since that blog post. Mitra serves 500+ e-commerce companies in India in real time.
