In-person + Virtual
October 24-28

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2022 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Friday, October 28 • 4:00pm - 4:35pm
Training AI To Code Using the Largest Code Dataset - Tommy Li & Animesh Singh, IBM


Project CodeNet is a large dataset of 14 million code samples, totaling 500 million lines of code in 55 programming languages. It enables machine learning for code, such as finding code similarity, extracting semantic context, and even translating between programming languages. Using the Machine Learning Exchange (MLX), a Linux Foundation for AI & Data Sandbox Project, we demonstrate how Project CodeNet can be leveraged to classify code and analyze code complexity in three steps. First, using DataShim, we turn domain-specific subsets of the data into Kubernetes Custom Resources. Second, running Jupyter notebooks on Kubernetes, we use the datasets to train deep learning models. Third, the models are served for inferencing as Kubernetes Custom Resources using KServe. For each of these steps, MLX generates Kubeflow Pipelines on Tekton, so data scientists are not required to write Kubernetes-specific code.
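To make the "Kubernetes Custom Resources" in the abstract more concrete, here is a minimal sketch of what a DataShim `Dataset` and a KServe `InferenceService` might look like when built as plain Python dictionaries. It is not taken from the talk. The resource names, bucket, endpoint, and storage path are hypothetical placeholders, and the `apiVersion` strings and field names follow the publicly documented examples for each project as best recalled, so they should be checked against the CRD versions installed on a given cluster.

```python
import json


def dataset_manifest(name, bucket, endpoint):
    """Build a DataShim-style Dataset custom resource as a dict.

    Assumption: apiVersion and spec.local fields mirror DataShim's
    published S3/COS example; verify against the installed CRD.
    """
    return {
        "apiVersion": "com.ie.ibm.hpsys/v1alpha1",
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {
            "local": {
                "type": "COS",
                "accessKeyID": "<ACCESS_KEY_ID>",          # placeholder, not a real credential
                "secretAccessKey": "<SECRET_ACCESS_KEY>",  # placeholder, not a real credential
                "endpoint": endpoint,
                "bucket": bucket,
                "readonly": "true",
            }
        },
    }


def inference_service_manifest(name, storage_uri, model_format="tensorflow"):
    """Build a KServe v1beta1 InferenceService custom resource as a dict."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "model": {
                    "modelFormat": {"name": model_format},
                    "storageUri": storage_uri,
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical names: a CodeNet subset exposed as a Dataset, and a
    # classifier trained on it, stored on a volume and served by KServe.
    dataset = dataset_manifest("codenet-subset", "codenet", "https://s3.example.com")
    service = inference_service_manifest(
        "codenet-classifier", "pvc://codenet-subset/models/classifier"
    )
    print(json.dumps(dataset, indent=2))
    print(json.dumps(service, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts JSON manifests. In the workflow described in the abstract, MLX generates the pipeline steps that create such resources, so a data scientist would not normally write them by hand.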


Animesh Singh

Distinguished Engineer and CTO - Watson Data and AI OSS Platform, IBM
Animesh Singh is CTO and Director for IBM Watson Data and AI Open Technology, responsible for Data and AI Open Technology strategy. Creating, designing and implementing IBM’s Data and AI engine for AI and ML platform, leading IBM’s Trusted AI efforts, driving the strategy and execution...

Tommy Li

Senior Software Developer, IBM
Tommy Li is a senior software developer at IBM focusing on Cloud, Kubernetes, and Machine Learning. He is one of the Kubeflow committers and has worked on various open-source projects related to Kubernetes, microservices, and deep learning applications to provide advanced use cases on...

Friday October 28, 2022 4:00pm - 4:35pm EDT
251 ABC
  Machine Learning + Data