BIOVIA Pipeline Pilot



Data has become ubiquitous, yet many scientific and engineering organizations still struggle to use the data at their disposal effectively. Teams use different tools and processes to access data, clean it, model it and share results, and those results often lack the technical depth needed to drive innovation. This disjointed, often generic approach to data science hinders the development of effective solutions, stifles collaboration and lowers trust in results.

To fully benefit from the potential data science offers, organizations need an end-to-end approach to leverage their data across the science and engineering enterprise.

Advanced Data Science, AI and Machine Learning: Democratized

The widespread adoption of AI and machine learning is transforming how business is being done – especially in science and engineering. However, organizations need to ensure their teams are focusing on fundamentals, not fads. Pipeline Pilot allows data scientists to train models with only a few clicks, compare the performance of model types and save trained models for future use. Expert users can also embed custom scripts from Python, Perl or R to maximize their use across the organization. Additionally, Pipeline Pilot is not a “black box.” Since every model is tied to a protocol, organizations have insight into where the data comes from, how it is cleaned and what models generate the results. End users can trust predictions and augment their scientific work with the latest machine learning techniques. Maximize the value of AI and machine learning for everyone.

Enterprise Data Science Solutions: Faster

With the demand for custom data science solutions on the rise, developers need ways to streamline protocol creation. Pipeline Pilot wraps complex functions in simple drag-and-drop components that can be strung into a workflow. These protocols can be shared between users and groups for reuse, ensuring that solutions are developed faster and standardized across the organization.
Make best practices standard practices.

Manual Data Science Workflows: Automated

Many data analytics processes, from pivoting spreadsheets to image analysis, still require tedious, manual steps. The magic of Pipeline Pilot lies in automating these processes. Schedule tasks, clean and blend datasets, and set up interactive dashboards. Protocols can also be set up as web services, simplifying the end user experience. Remove non-value added tasks and let your scientific team spend its time on what it does best: science.

Support End-To-End Data Science Workflows

Between developers, data scientists, “citizen” data scientists and business leaders, data science solutions require comprehensive configurability and extensibility to be effective. Pipeline Pilot supports end-to-end automated workflow creation and execution. Connect to internal and external data sources. Access and parse all kinds of scientific data. Prep data for analysis. Build, train and maintain models. Deploy services where they’re needed, when they’re needed. All in a single workflow.


Pipeline Pilot streamlines the building, sharing and deployment of complex data workflows with a graphical, code-optional environment purpose-built for science and engineering. Developers can rapidly create, share and reuse complex workflows. Data scientists can automate data access, cleaning and model creation. Researchers can augment their work with the latest AI and machine learning tools. Business leaders can quickly and easily interpret results in interactive custom dashboards.

Source data when it’s needed, no migration required
Read and write data across internal databases, Hadoop warehouses, cloud applications, flat files and more:

  • Tabular data
  • Structured (JSON, XML)
  • Office documents (PowerPoint, Excel, etc.)
  • PDF
  • Streaming (Kafka)
  • 3DEXPERIENCE Platform
  • Connect to external data sources via an extensive library of APIs
  • Out-of-the-box connections to various third-party databases
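Outside Pipeline Pilot, the kind of parsing its reader components perform can be sketched in plain Python with the standard library. The field names and sample data below are hypothetical illustrations, not part of any Pipeline Pilot API:

```python
import json
import xml.etree.ElementTree as ET

def json_to_rows(text):
    """Flatten a JSON array of objects into uniform tabular rows."""
    records = json.loads(text)
    columns = sorted({key for rec in records for key in rec})
    return [{col: rec.get(col) for col in columns} for rec in records]

def xml_to_rows(text, record_tag):
    """Turn repeated XML elements into rows keyed by child tag name."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in rec}
            for rec in root.iter(record_tag)]

# Hypothetical sample data for illustration only.
json_rows = json_to_rows('[{"id": 1, "yield": 0.82}, {"id": 2}]')
xml_rows = xml_to_rows(
    "<runs><run><id>1</id><temp>300</temp></run></runs>", "run")
```

Note how `json_to_rows` pads missing keys with `None`, so downstream steps can rely on every row having the same columns.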


Automate data preparation and handling
  • Clean and blend data via user-set parameters
  • Schedule tasks to automate data preparation
  • Share complete protocols to minimize the time spent setting up analyses
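As a plain-Python illustration of the clean-and-blend step above (the parameters, field names and sample tables are hypothetical, not Pipeline Pilot components):

```python
def clean(rows, required="id", fill=0.0):
    """Drop rows missing the required key; fill gaps in other fields."""
    cleaned = []
    for row in rows:
        if row.get(required) is None:
            continue
        cleaned.append({k: (fill if v is None else v) for k, v in row.items()})
    return cleaned

def blend(left, right, key="id"):
    """Left-join two record sets on a shared key."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]

# Hypothetical assay and sample tables.
assays = clean([{"id": 1, "activity": None}, {"id": None, "activity": 2.0}])
samples = [{"id": 1, "batch": "A"}]
merged = blend(assays, samples)
```

Exposing `required`, `fill` and `key` as parameters is what lets one shared protocol cover many datasets without editing the workflow itself.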


Operationalize discipline-specific capabilities and analyses

  • Utilize off-the-shelf components to run scientific calculations on your data
  • Drive analyses with technical data such as:
    • Images
    • Spectral data
    • DNA/RNA/Protein Sequences
    • Chemistry
    • Text
    • Streaming (IoT/IoE)
    • Financial
    • Geology/Location data

Democratize powerful machine learning and predictive analytics

  • Automate model building using over 30 supervised and unsupervised machine learning algorithms, including:
    • Random Forest, XGBoost, Neural Networks, Linear Regression, Support Vector Machines, Principal Component Analysis (PCA), Genetic Function Approximation (GFA)
  • Apply deep learning models and connect custom scripts from Python, R, Perl and more directly into protocols
  • Validate model performance and compare different approaches side-by-side
  • Share trained models to streamline development
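The train-compare-save loop the list above describes can be sketched in plain Python with the standard library alone. This is a generic stand-in, not Pipeline Pilot's model-building components; the synthetic data and the two candidate models are assumptions for illustration:

```python
import pickle
import random
import statistics

def fit_linear(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def predict(coeffs, x):
    intercept, slope = coeffs
    return intercept + slope * x

def mse(coeffs, xs, ys):
    """Mean squared error of a model on a held-out set."""
    return statistics.fmean((predict(coeffs, x) - y) ** 2
                            for x, y in zip(xs, ys))

# Synthetic data standing in for a real training set.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(100)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.5) for x in xs]
train_x, test_x = xs[:80], xs[80:]
train_y, test_y = ys[:80], ys[80:]

# Compare candidate models side-by-side on held-out data.
candidates = {
    "baseline": (statistics.fmean(train_y), 0.0),  # mean-only model
    "linear": fit_linear(train_x, train_y),
}
scores = {name: mse(c, test_x, test_y) for name, c in candidates.items()}
best = min(scores, key=scores.get)

# Persist the winning model so other protocols can reuse it.
saved = pickle.dumps(candidates[best])
```

The same pattern — fit several candidates, score them on data they never saw, keep and share the winner — applies whether the models are a two-parameter line or a deep network.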


Maximize the impact of your data science and development teams

  • Deploy protocols as web services or APIs, or publish them to the built-in web interface (Web Port)
  • Create and combine interactive charts and visuals for key stakeholders
    • Fully HTML5-compliant
    • Graphs include scatter plots, line plots, pie charts, bar plots, histograms, polar plots and more
    • Interactive tables with sorting, search and more
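The web-service deployment pattern can be sketched with Python's standard library. A real Pipeline Pilot deployment publishes the protocol itself, so everything below — the endpoint, the payload shape and the toy scoring function — is a hypothetical stand-in:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class PredictHandler(BaseHTTPRequestHandler):
    """Expose a toy scoring function as a JSON-over-HTTP endpoint."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        result = {"score": 2.0 * payload["x"] + 1.0}  # stand-in model
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Port 0 asks the OS for any free port; serve in a background thread.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# An end user (or another workflow) calls the service over plain HTTP.
request = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}",
    data=json.dumps({"x": 3.0}).encode(),
    headers={"Content-Type": "application/json"})
with urllib.request.urlopen(request) as response:
    score = json.loads(response.read())["score"]
server.shutdown()
```

Wrapping a workflow behind an HTTP endpoint like this is what lets end users invoke it from a browser or script without ever opening the workflow itself.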