
Google Launches AI Model Gemini: How Is It Different From ChatGPT?



Introducing Gemini: Google's largest and most capable AI model.

Gemini is a significant leap forward in AI's ability to improve our daily lives.

Gemini is a product of a massive collaborative effort by teams at Google. It was designed to be multimodal, which means it can easily understand, operate across, and combine various types of information such as text, code, audio, image, and video. 

Gemini was built from scratch with the goal of being able to generalize and seamlessly integrate different forms of data.

Gemini is also Google's most flexible model yet, able to run efficiently on everything from data centers to mobile devices. According to Google, its state-of-the-art capabilities will significantly enhance how developers and enterprise customers build and scale with AI.

Google optimized Gemini 1.0, the first version, for three different sizes:

  • Gemini Ultra: Google's largest and most capable model for highly complex tasks.
  • Gemini Pro: Google's best model for scaling across a wide range of tasks.
  • Gemini Nano: Google's most efficient model for on-device tasks.

How does it differ from ChatGPT?

TEXT

Higher is better. For GPT-4, API numbers were calculated where reported numbers were missing.

| Capability | Benchmark | Description | Gemini Ultra | GPT-4 |
|---|---|---|---|---|
| General | MMLU | Representation of questions in 57 subjects (incl. STEM, humanities, and others) | 90.0% (CoT@32*) | 86.4% (5-shot*, reported) |
| Reasoning | Big-Bench Hard | Diverse set of challenging tasks requiring multi-step reasoning | 83.6% (3-shot) | 83.1% |
| Reasoning | DROP | Reading comprehension (F1 score) | 82.4 (variable shots) | 80.9 (3-shot, reported) |
| Reasoning | HellaSwag | Commonsense reasoning for everyday tasks | 87.8% (10-shot*) | 95.3% (reported) |
| Math | GSM8K | Basic arithmetic manipulations (incl. grade school math problems) | 94.4% (maj@32) | 92.0% |
| Math | MATH | Challenging math problems (incl. algebra, geometry, pre-calculus, and others) | 53.2% (4-shot) | 52.9% (4-shot, API) |
| Code | HumanEval | Python code generation | 74.4% (0-shot, IT*) | 67.0% (0-shot*, reported) |
| Code | Natural2Code | Python code generation. New held-out, HumanEval-like dataset, not leaked on the web | 74.9% (0-shot) | 73.9% (0-shot, API) |

* See Google's technical report for details on performance with other methodologies.
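Some of the scores above carry labels such as "maj@32". As commonly understood, this means the model is sampled 32 times per question and the most frequent final answer is the one that gets graded. The sketch below illustrates that idea only; the function names and the sample data are hypothetical, and Google's actual evaluation harness is not described in this article.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    best = max(counts.values())
    for answer in answers:
        if counts[answer] == best:
            return answer


def maj_at_k_accuracy(sampled_answers, references):
    """Fraction of questions whose majority-voted answer matches the reference."""
    correct = sum(
        majority_vote(samples) == ref
        for samples, ref in zip(sampled_answers, references)
    )
    return correct / len(references)


# Hypothetical data: 4 samples per question instead of 32, to keep it short.
samples = [
    ["18", "18", "20", "18"],  # majority answer is "18"
    ["7", "9", "9", "7"],      # tie, so the first-seen answer "7" is picked
]
references = ["18", "9"]
accuracy = maj_at_k_accuracy(samples, references)
```

Voting over many samples tends to smooth out one-off reasoning slips, which is one reason scores obtained with different sampling and shot settings are not directly comparable.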

MULTIMODAL

Higher is better unless otherwise noted. The previous SOTA model is listed when a capability is not supported in GPT-4V.

| Capability | Benchmark | Description | Gemini | GPT-4V or previous SOTA |
|---|---|---|---|---|
| Image | MMMU | Multi-discipline college-level reasoning problems | 59.4% (Gemini Ultra, pixel only) | 56.8% |
| Image | VQAv2 | Natural image understanding | 77.8% (Gemini Ultra, pixel only) | 77.2% |
| Image | TextVQA | OCR on natural images | 82.3% (Gemini Ultra, pixel only) | 78.0% |
| Image | DocVQA | Document understanding | 90.9% (Gemini Ultra, pixel only) | 88.4% (0-shot, GPT-4V, pixel only) |
| Image | Infographic VQA | Infographic understanding | 80.3% (Gemini Ultra, pixel only) | 75.1% (0-shot, GPT-4V, pixel only) |
| Image | MathVista | Mathematical reasoning in visual contexts | 53.0% (0-shot, Gemini Ultra, pixel only) | 49.9% (0-shot, GPT-4V) |
| Video | VATEX | English video captioning (CIDEr) | 62.7 (4-shot, Gemini Ultra) | 56.0 (4-shot, DeepMind Flamingo) |
| Video | Perception Test MCQA | Video question answering | 54.7% (0-shot, Gemini Ultra) | 46.3% (0-shot, SeViLA) |
| Audio | CoVoST 2 (21 languages) | Automatic speech translation (BLEU score) | 40.1 (Gemini Pro) | 29.1 (Whisper v2) |
| Audio | FLEURS (62 languages) | Automatic speech recognition (based on word error rate, lower is better) | 7.6% (Gemini Pro) | 17.6% |

Gemini surpasses state-of-the-art performance on a range of multimodal benchmarks.
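The FLEURS result is reported as a word error rate (WER), the one metric in the comparison where lower is better. As a rough illustration, WER is usually computed as the word-level edit distance between the reference transcript and the model's transcript, divided by the number of reference words. The sketch below follows that textbook definition; it is not the scoring code used for the published numbers, and it skips the text normalization real evaluations apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref_words)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Because insertions count as errors, WER can exceed 100% when the transcript contains many extra words.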


Next-generation capabilities

Until now, multimodal models were typically created by training separate components for different data types, such as text or images, and then combining them to perform certain tasks. These models can be good at describing images and other straightforward tasks, but they often struggle with complex reasoning and conceptual understanding.

According to the Google blog, Gemini was designed to be natively multimodal and pre-trained from the start on different modalities, then fine-tuned with additional multimodal data to further refine its effectiveness. Google says this allows Gemini to understand and reason about various inputs better than existing multimodal models, and that its capabilities are state-of-the-art across domains.

So What's New in the Google Gemini AI Model?

According to Google, Gemini Ultra scores 90.0% on MMLU (Massive Multitask Language Understanding), a widely used benchmark for testing AI models' knowledge and problem-solving abilities, which is better than the results achieved by human experts.

Understanding text, images, audio and more

Gemini 1.0 was created to understand text, images, audio, and more. This helps it to comprehend complicated topics and answer questions with ease. It's especially good at breaking down math and physics concepts.

Advanced coding

The first version of Gemini can write high-quality code in the most popular programming languages, such as Python, Java, C++, and Go. Google describes it as one of the leading foundation models for coding worldwide because it can work across languages and analyze complex information.

Gemini Ultra performs well on coding benchmarks, scoring 74.4% on HumanEval and 74.9% on Natural2Code. Natural2Code is Google's internal held-out dataset. It uses author-generated sources instead of web-based information, so its problems should not have leaked onto the web.
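Benchmarks in the HumanEval style score generated code by running it against unit tests rather than comparing text. The sketch below shows that idea in its simplest form, under loud assumptions: the two problems and the "generated" candidates are hypothetical, and a real harness would run untrusted code in a sandbox with timeouts instead of calling `exec` directly.

```python
def passes_tests(candidate_source: str, test_source: str) -> bool:
    """Run the candidate code, then its tests; any exception counts as a failure."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # defines the generated function
        exec(test_source, namespace)       # assertions raise on wrong output
    except Exception:
        return False
    return True


# Hypothetical problems: one correct candidate and one buggy candidate.
problems = [
    {
        "candidate": "def add(a, b):\n    return a + b\n",
        "tests": "assert add(2, 3) == 5\n",
    },
    {
        "candidate": "def is_even(n):\n    return n % 2 == 1\n",  # bug: tests for odd
        "tests": "assert is_even(4)\n",
    },
]

passed = sum(passes_tests(p["candidate"], p["tests"]) for p in problems)
pass_rate = passed / len(problems)
```

With one candidate generated per problem, this pass rate corresponds to the kind of single-attempt score reported for 0-shot code generation.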

How Does Gemini Solve Problems?

More reliable, scalable, and efficient

Google trained Gemini 1.0 at scale on its AI-optimized infrastructure using its in-house designed Tensor Processing Units (TPUs) v4 and v5e. The company says it designed Gemini to be its most reliable and scalable model to train and its most efficient to serve.

Google's AI-powered products, like Search, YouTube, Gmail, Google Maps, Google Play, and Android, serve billions of users worldwide and are powered by TPUs, the custom AI accelerators that Google designed in-house. Training large-scale AI models can be costly, and TPUs make it more cost-efficient. On TPUs, Gemini runs much faster than earlier, less capable models. Companies around the world have also used these accelerators to train large-scale AI models cost-effectively.

"Today, we're announcing the most powerful, efficient, and scalable TPU system to date, Cloud TPU v5p, designed for training cutting-edge AI models," Google said. "This next-generation TPU will accelerate Gemini's development and help developers and enterprise customers train large-scale generative AI models faster, allowing new products and capabilities to reach customers sooner."

Building with Gemini

Starting December 13, developers and enterprise customers can access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI.

Google AI Studio is a web-based tool that developers can use to quickly create and launch apps with an API key. If you need a fully managed AI platform with room for customization, Vertex AI is the better fit. With Vertex AI, you keep full control over your data and benefit from the additional security, privacy, and compliance features that Google Cloud provides.
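To give a feel for what calling the Gemini API with an API key might look like, here is a minimal sketch that builds a text-only request using only the Python standard library. The endpoint path, the `gemini-pro` model name, the `x-goog-api-key` header, and the response shape are assumptions based on Google's public REST documentation around launch and may have changed since; the helper names are hypothetical. Check the current API reference before relying on any of it.

```python
import json
import urllib.request

# Assumed values: verify against the current Gemini API documentation.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_BASE}/models/{MODEL}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key created in Google AI Studio
        },
        method="POST",
    )


def generate(prompt: str, api_key: str) -> str:
    """Send the request and return the first candidate's text (assumed shape)."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as response:
        reply = json.load(response)
    return reply["candidates"][0]["content"]["parts"][0]["text"]
```

Calling `generate("Explain TPUs in one sentence.", "YOUR_API_KEY")` would perform the network request. Google also publishes official client libraries, which are usually the more convenient choice for production apps.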

Android developers will soon be able to build with Gemini Nano, Google's most efficient model for on-device tasks, through AICore. This new system capability will be available in Android 14, starting with Pixel 8 Pro devices. If you're interested, you can sign up for an early preview of AICore.

Gemini Ultra coming soon

For Gemini Ultra, Google says it is currently completing extensive trust and safety checks, including red-teaming by trusted external parties, and further refining the model using fine-tuning and reinforcement learning from human feedback (RLHF) before making it broadly available.

Google Bard - Next Level Coming

Google also announced that it will launch Bard Advanced early next year, a new AI experience that provides access to its best models and capabilities, starting with Gemini Ultra.
