
Design Concepts

Why Data-Centric

The AI landscape has been revolutionized by data technology, with large language models empowering organizations to create "super employees" for unprecedented productivity gains. While deep learning algorithms are widely available and computing power is readily accessible, the true differentiator lies in the quality and refinement of training data.

Enterprises need a data pipeline that is discoverable, manageable, and collaborative, and also iterative, so that data and models can be refined continuously. This is a shift toward data-centric internal collaboration and away from traditional human-centric approaches. Businesses that adopt it can work more efficiently and gain a competitive advantage in the AI 2.0 era.

MorningStar emerges as a cutting-edge workspace designed specifically for data-centric model development. It charts a unique yet proven path in the AI 2.0 landscape by encompassing the entire data workflow, from training to production. This includes comprehensive data management, iteration, optimization, and mining capabilities. MorningStar is dedicated to empowering businesses to establish efficient full-stack data pipelines, maximizing the value of their data and unlocking the full potential of their AI models.

About MorningStar

Empowering AI 2.0 with Unparalleled Data Management



MorningStar is a cutting-edge, full-stack AI data platform designed specifically for the AI 2.0 era. It streamlines the management of unstructured data, empowering algorithm engineers and data scientists to unlock the true potential of their data and accelerate the development of innovative algorithms.

Key Features & Benefits

  • Optimized Data Lifecycle Management: Seamlessly manage your data from acquisition to deployment, ensuring data quality and consistency throughout the entire lifecycle.
  • Comprehensive Data Exploration Toolkit: Uncover hidden patterns and insights within your data using powerful visualization and analysis tools.
  • Enhanced Metrics Tracking & Model Reproducibility: Accurately track model performance, enabling data-driven decisions and ensuring reproducibility for reliable results.
  • Streamlined & Regulatory-Compliant Data Asset Management: Efficiently manage your data assets while adhering to industry regulations and best practices.

MorningStar's capabilities empower you to:

  • Access Data Seamlessly: Import data locally, connect to databases, and integrate with cloud storage solutions like OSS, OBS, and S3.
  • Visualize Data Intuitively: Gain insights into your data through comprehensive visualizations of images, videos, point clouds, audio, and text, with multi-modal data rendering and version comparison.
  • Manage Data Throughout its Lifecycle: Slice, version, and process your data with custom workflows and prompt management, ensuring optimal data utilization.
  • Manage & Evaluate Algorithms: Register and invoke models, then evaluate their performance using single or multi-model metrics and identify challenging cases for improvement.
  • Explore Data In-Depth: Perform semantic and cross-modal searches, compare data against ground truth, and uncover valuable insights.
  • Ensure Data Security: Protect your valuable data assets with unified authentication, end-to-end encryption, and robust anomaly detection.

Glossary of Key Terms

Term        Description
Dataset     A collection of data managed on the platform, serving as the fundamental unit for data management.
Data Flow   A user-defined sequence of steps that process and refine data within a dataset, allowing for iterative development and optimization.
Component   The building block of a data flow, representing individual data processing operations or functions.
Slice       A logical subset of a dataset, defined by user-specified criteria, enabling focused analysis or processing of specific data segments.
Version     An iteration or snapshot of a dataset resulting from data processing or modifications, facilitating traceability and reproducibility.
Semantics   The underlying meaning or interpretation of data within a dataset, often captured through metadata or annotations.
Role        A designation that defines the level of access and permissions granted to users within the platform, ensuring data security and control.
Team        The smallest organizational unit within the platform, typically comprising a group of users collaborating on data-related projects or tasks.
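
To make the relationships between these terms concrete, here is a minimal Python sketch. It is not MorningStar's API: the `Dataset` class, its `slice` and `run_flow` methods, and the two example components are hypothetical names chosen for illustration. It only assumes what the glossary states: a data flow is an ordered sequence of components, running it yields a new version, and a slice is a subset selected by a user-defined criterion.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Item = Dict[str, object]
Component = Callable[[List[Item]], List[Item]]  # one processing step (hypothetical type)


@dataclass
class Dataset:
    """Hypothetical stand-in for a platform dataset; versions are kept in order."""
    name: str
    versions: List[List[Item]] = field(default_factory=list)

    def latest(self) -> List[Item]:
        return self.versions[-1]

    def slice(self, predicate: Callable[[Item], bool]) -> List[Item]:
        """A slice: the logical subset of the latest version matching a criterion."""
        return [item for item in self.latest() if predicate(item)]

    def run_flow(self, flow: List[Component]) -> int:
        """Apply each component in order and store the result as a new version."""
        data = list(self.latest())
        for component in flow:
            data = component(data)
        self.versions.append(data)
        return len(self.versions) - 1  # index of the new version


def drop_unlabeled(items: List[Item]) -> List[Item]:
    # Component 1: keep only items that carry a label.
    return [i for i in items if i.get("label") is not None]


def lowercase_labels(items: List[Item]) -> List[Item]:
    # Component 2: normalize label casing without mutating the input items.
    return [{**i, "label": str(i["label"]).lower()} for i in items]


ds = Dataset("demo", versions=[[
    {"id": 1, "modality": "image", "label": "Cat"},
    {"id": 2, "modality": "text", "label": None},
    {"id": 3, "modality": "image", "label": "DOG"},
]])
new_version = ds.run_flow([drop_unlabeled, lowercase_labels])
images = ds.slice(lambda i: i["modality"] == "image")
```

Because `run_flow` appends rather than overwrites, the original data stays available as version 0, which is the traceability property the Version entry describes.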

MorningStar sets itself apart from competitors by offering a comprehensive, user-friendly platform that simplifies the complexities of data management in the AI 2.0 era. Experience the future of AI data management with MorningStar and unlock the full potential of your data.


Rosetta: Your Comprehensive AI Labeling and Data Management Solution

Rosetta is a powerful and versatile platform designed to streamline your AI labeling and data management workflows. It can be seamlessly integrated into MorningStar or used as a standalone product, catering to various data annotation and human feedback needs.

Rosetta supports a wide range of data types, including 3D point clouds, 2D images, text, audio, video, and more. Its low-code data management approach allows for customizable workflows, ensuring efficient data access, labeling, inspection, sampling, and delivery across diverse scenarios and accuracy requirements. In addition, Rosetta provides a suite of project management tools to enhance your team's productivity and collaboration.

Unlock the Power of Rosetta

Unrivaled Data Coverage

  • All-Encompassing: Handle projects involving point clouds, images, text, audio, video, and data collection with ease.
  • Extensive Labeling Scenarios: Choose from 100+ pre-built labeling scenarios to accelerate your projects.
  • Granular Labeling: Hundreds of labeling functions cater to even the most intricate annotation details.

Scalable and Efficient

  • Massive Teams: Build teams of up to tens of thousands of users.
  • Global Collaboration: Enable simultaneous work for thousands of users worldwide.
  • High-Volume Projects: Label millions of items within a single project effortlessly.
  • Advanced Tagging: Assign tens of thousands of tags to a single item without any performance issues.

Accelerated Labeling

  • Rapid Project Setup: Create projects and start labeling within minutes.
  • AI-Powered Pre-labeling: Leverage OCR, ASR, semantic segmentation, and vehicle MOT algorithms to automate labeling with over 80% accuracy, significantly boosting efficiency.
  • Real-Time Feedback: Receive immediate quality inspection feedback from algorithms and rules to correct errors promptly.
  • High Throughput: Generate over 100,000 qualified labels per hour.
  • Seamless Integration: Easily import and export data for model training using minimal code.

Precision Quality Assurance

  • Granular Inspection: Inspect labels at a granular level, identify error reasons and types, and analyze common project mistakes.
  • Detailed Statistics: Track progress and billing at the label level, with automated daily email reports.
  • Multiple Inspection Methods: Choose from at least six quality inspection techniques, including algorithm-based, rule-based, full-sample, sampling, and multi-layer inspections, with cross-checking for subjective tasks.
  • Exceptional Accuracy: Achieve 99.9% accuracy for image and text projects and 99.5% for point cloud and video projects.
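
As an illustration of the sampling inspection method listed above, the following Python sketch draws a random sample of submitted labels and estimates accuracy from the reviewer's verdicts. It is an assumption-laden sketch, not Rosetta's implementation: `sampling_inspection`, the `passed_review` field, and the 10% sample rate are all hypothetical.

```python
import random


def sampling_inspection(labels, is_correct, sample_rate, seed=0):
    """Inspect a random fraction of labels and estimate accuracy from that sample.

    `is_correct` stands in for the reviewer's verdict on one label (hypothetical).
    """
    if not labels:
        raise ValueError("no labels to inspect")
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    k = max(1, round(len(labels) * sample_rate))
    sample = rng.sample(labels, k)
    errors = [lab for lab in sample if not is_correct(lab)]
    return {
        "sampled": k,
        "errors": len(errors),
        "accuracy": 1 - len(errors) / k,
    }


# Hypothetical batch: every 50th label was rejected by the reviewer.
labels = [{"id": i, "passed_review": i % 50 != 0} for i in range(1000)]
report = sampling_inspection(labels, lambda lab: lab["passed_review"], sample_rate=0.1)
```

The reported accuracy is an estimate for the sampled items only; full-sample inspection corresponds to `sample_rate=1.0` in this sketch.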

Flexible Workflow Configuration

  • Dynamic Labeling: Add, delete, or modify labels at any stage of the project to refine annotation rules.
  • Industry-Leading Workflow Management: Employ a unique DAG (Directed Acyclic Graph) workflow management system for simultaneous and flexible management of multiple teams and users within a project.
  • Innovative Label Relationships: Define parent-child nested label relationships for optimal label configuration and enhanced efficiency.
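
The DAG idea can be sketched with Python's standard `graphlib` module. The stage names below are hypothetical and this is not how Rosetta defines or executes workflows; the sketch only shows how a directed acyclic graph lets independent stages (for example, two inspection steps handled by different teams) proceed at the same time while dependent stages wait.

```python
from graphlib import TopologicalSorter

# Hypothetical stages: each key maps to the stages that must finish first.
workflow = {
    "pre_labeling": set(),
    "labeling": {"pre_labeling"},
    "rule_check": {"labeling"},
    "sampling_inspection": {"labeling"},
    "delivery": {"rule_check", "sampling_inspection"},
}

sorter = TopologicalSorter(workflow)
sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # stages whose prerequisites are all done
    waves.append(ready)
    sorter.done(*ready)  # mark the whole wave finished, unlocking the next one
```

Each entry in `waves` is a group of stages that can run in parallel; here `rule_check` and `sampling_inspection` land in the same wave because both depend only on `labeling`.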

Key Terminology

Term                               Description
Instance                           Labeled result on the annotation page
Labeling Tool                      Tool used for annotation, such as 3D/2D bounding boxes
Task                               The smallest unit of annotation work
Workflow                           Organized sequence of tasks (DAG diagram)
Auxiliary Algorithm                Algorithm that assists human annotators
Pre-labeling Algorithm             Algorithm that automates labeling before human intervention
Quality Inspection Algorithm/Rule  Algorithm or rule used to validate submitted annotations