import { Contact } from "./contact";

export const News4 = () => {
  return (
    <div id="news1">
      <div className="container">
        <div className="col-md-8 col-md-offset-2 section-title text-start">
          <h2>Data labeling will fuel the AI revolution</h2>
          <p>
            AI fuels modern life, from the way we commute to how we order
            online and how we find a date or a job. Counting Facebook and
            Google users alone, billions of people use AI-powered applications
            every day. This is only the tip of the iceberg when it comes to
            AI’s potential.
          </p>
          <h3></h3>
          <p>
            OpenAI, which recently made headlines again for making its models
            generally available, uses labeled data to “improve language
            model behavior,” or to make its AI fairer and less biased. This is
            an important example, as OpenAI’s models were long criticized for
            producing toxic and racist output.
          </p>
          <h3></h3>
          <p>
            Many of the AI applications we use day-to-day require a particular
            dataset to function well. To create these datasets, we need to label
            data for AI.
          </p>
          <h3></h3>
          <h3>Why does AI need data labeling?</h3>
          <h3></h3>
          <p>
            The term artificial intelligence is somewhat of a misnomer. AI is
            not actually intelligent. It takes in data and uses algorithms to
            make predictions based on that data. This process requires a large
            amount of labeled data.
          </p>
          <h3></h3>
          <p>
            This is particularly the case when it comes to challenging domains
            like healthcare, content moderation, or autonomous vehicles. In many
            instances, human judgment is still required to ensure the models are
            accurate.
          </p>
          <h3></h3>
          <p>
            Consider the example of sarcasm in social media content moderation.
            A Facebook post might read, “Gosh, you’re so smart!” That could be
            sarcastic in a way that an automated system would miss. More
            perniciously, a language model trained on biased data can be sexist,
            racist, or otherwise toxic. For instance, the GPT-3 model
            associated Muslims and Islam with terrorism until labeled data was
            used to improve the model’s behavior.
          </p>
          <h3></h3>
          <p>
            As long as human bias is handled as well, “supervised models
            allow for more control over bias in data selection,” a 2018
            TechCrunch article stated. OpenAI’s newer models are a clear
            example of using labeled data to control bias. Controlling bias with
            data labeling is vitally important, as low-quality AI models have
            even landed companies in court. One firm attempted to use AI as a
            screen reader, only to agree to a settlement later when the model
            didn’t work as advertised.
          </p>
          <h3></h3>
          <p>
            The importance of high-quality AI models is making its way into
            regulatory frameworks as well. For example, the European
            Commission’s regulatory framework proposal on artificial
            intelligence would subject some AI systems to obligations such as
            “high quality of the datasets feeding the system to minimize risks
            and discriminatory outcomes.”
          </p>
          <h3></h3>
          <p>
            Standardized language and tone analysis are also critical in content
            moderation. It’s not uncommon for people to define the word
            “literally” differently, or to disagree about how literally they
            should take a phrase such as “It was like banging your head against
            a wall!” To decide which posts violate community standards, we
            need to analyze these kinds of subtleties.
          </p>
          <h3></h3>
          <p>
            Similarly, the AI startup Handl uses labeled data to more accurately
            convert documents to structured text. We’ve all heard of OCR
            (Optical Character Recognition), but with AI powered by labeled
            data, it’s being taken to a whole new level.
          </p>
          <h3></h3>
          <p>
            To give another example, to train an algorithm to analyze medical
            images for signs of cancer, you would need a large dataset of
            medical images labeled with the presence or absence of cancer.
            When the model must also locate the affected tissue, the task is
            commonly referred to as image segmentation, and it requires
            labeling regions pixel by pixel, which can mean tens of thousands
            of labeled points in each image. In general, the more high-quality
            labeled data you have, the better your model will be at making
            accurate predictions.
          </p>
          <h3></h3>
          <p>
            Sure, it’s possible to train AI algorithms on unlabeled data, but
            this can lead to biased results, which could have serious
            implications in many real-world cases.
          </p>
          <h3></h3>
          <h3>Applications using data labeling</h3>
          <h3></h3>
          <p>
            Data labeling is vital for applications across search, computer
            vision, voice assistants, content moderation, and more.
          </p>
          <h3></h3>
          <p>
            Search was one of the first major AI use cases to rely on human
            judgment to determine relevance. With labeled data, search results
            can be extremely accurate. For instance, Yandex turned to human
            “annotators” from Toloka to help improve its search engine.
          </p>
          <h3></h3>
          <p>
            Some of the most popular uses of AI in healthcare include helping
            to diagnose skin conditions and diabetic retinopathy, boosting
            recall rates for medication compliance reviews, and analyzing
            radiologist reports to detect eye conditions like glaucoma.
          </p>
          <h3></h3>
          <p>
            Content moderation has also seen significant advances thanks to AI
            applied to large quantities of labeled data. This is especially true
            for sensitive topics like violence or threats of violence. For
            example, people may post videos on YouTube threatening suicide,
            which need to be immediately detected and differentiated from
            informational videos about suicide.
          </p>
          <h3></h3>
          <p>
            Another important use of data labeling is helping voice assistants
            like Alexa or Siri understand voices with any accent or tone. This
            requires training an algorithm to recognize a wide range of speech
            patterns, including male and female voices, based on large volumes
            of labeled audio.
          </p>
          <h3></h3>
          <h3>Human computing for labeling at scale</h3>
          <h3></h3>
          <p>
            All this raises the question: How do you create labeled data at
            scale?
          </p>
          <h3></h3>
          <p>
            Manually labeling data for AI is an extremely labor-intensive
            process. It can take weeks or months to label a few hundred samples
            using this approach, and accuracy is often poor, particularly for
            niche labeling tasks. Additionally, staying competitive means
            continually updating datasets and building larger ones than
            competitors have.
          </p>
          <h3></h3>
          <p>
            The best way to scale data labeling is with a combination of machine
            learning and human expertise. Companies like Toloka, Appen, and
            others use AI to match the right people with the right tasks, so the
            experts do the work that only they can do. This allows firms to
            scale their labeling efforts. Further, AI can weight the answers
            from different respondents according to the quality of their past
            responses, so that more reliable respondents count for more. This
            gives each label a high chance of being accurate.
          </p>
          <h3></h3>
          <p>
            With techniques like these, labeled data is fueling a new AI
            revolution. By combining AI with human judgment, companies can
            create accurate models of their data. These models can then be used
            to make better decisions that have a measurable impact on
            businesses.
          </p>
        </div>
      </div>
      <Contact />
    </div>
  );
};
