Gemini API and How to Use It

Artificial intelligence becomes truly powerful when it moves from a chat window into automation. Instead of manually asking questions, we can build programs that send thousands of requests, generate reports, prepare lecture notes, summarize books, or build datasets while we sleep. The bridge between an idea and such automation is the API key. Understanding how this key works, how it is protected, and how it is used inside real code is the first major step toward becoming an effective AI engineer.

This article explains everything from the ground up. No assumption is made about prior API experience. By the end, you will understand authentication, request construction, response parsing, error handling, retries, and safe deployment practices. We will also dissect a working Bash program line by line so that every symbol becomes meaningful rather than mysterious.

What Is an API?

An API, or Application Programming Interface, is a formal method that allows software systems to talk to each other. When your program sends a request to a model, it is essentially writing a structured letter that says: here is my question, here are my parameters, please compute and return an answer. The remote service receives the request, processes it using its infrastructure and models, and sends back a response.

Unlike a web page meant for humans, an API is meant for machines. Therefore communication is precise, standardized, and usually formatted in JSON.

What Is an API Key?

An API key is a secret token that identifies who is making the request and determines:

whether access is allowed
which models are available
how billing is tracked
what quotas apply

Without the key, the server cannot authenticate you. With an invalid key, the request is rejected. With a valid key, usage is recorded against your project.

Think of it as a combination of identity card, meter, and permission slip.

Why Security Matters

If someone else obtains your key, they can run requests that you pay for. Therefore keys must never be placed in public repositories, screenshots, or blog posts. Professional workflows store them in environment variables or secret managers rather than directly inside code files.

Where the Key Lives

Most command line setups use environment variables. Instead of writing the key in the script, we write:

export GOOGLE_API_KEY="AIza…"

The script can read this value, but it is not permanently baked into the source code. This separation is fundamental for security, collaboration, and deployment.
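A script can also refuse to start when the variable is missing, which avoids sending requests that are certain to be rejected. The following is a minimal sketch, assuming the variable name GOOGLE_API_KEY used above; the function name require_api_key is only illustrative.

```shell
#!/bin/bash

# Return 0 if GOOGLE_API_KEY is set and non-empty.
# Otherwise print a hint to stderr and return 1.
require_api_key() {
  if [ -z "${GOOGLE_API_KEY:-}" ]; then
    echo "GOOGLE_API_KEY is not set. Export it before running this script." >&2
    return 1
  fi
}
```

Placing `require_api_key || exit 1` near the top of a script stops it before any network call is made.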

The Big Picture of a Request

Every AI API call follows the same lifecycle:

prepare input
authenticate
send request
receive JSON
extract useful text
handle failure if necessary

Once you understand this cycle, every language and framework becomes easier.

Example Program

We will now analyze the following script in detail. This script reads names from a file, sends them to the Gemini API, and writes the generated paragraphs to an output file. It also includes retry logic to handle transient failures. For simplicity, we will use Bash and curl, but the concepts apply universally.

#!/bin/bash

INPUT="names.txt"
OUTPUT="output.txt"
MODEL="gemini-2.5-flash"

# Start with an empty output file
> "$OUTPUT"

# Read one name per line, including a last line without a trailing newline
while IFS= read -r name || [ -n "$name" ]; do
  echo "Processing: $name"
  echo "## $name" >> "$OUTPUT"

  # Try each request up to three times
  for i in {1..3}; do
    result=$(curl -s -H "Content-Type: application/json" \
      "https://generativelanguage.googleapis.com/v1/models/${MODEL}:generateContent?key=${GOOGLE_API_KEY}" \
      -d "{\"contents\": [{ \"parts\": [{ \"text\": \"Write one concise academic paragraph about the life and contributions of ${name}.\" }] }] }" \
      | jq -r '.candidates[0].content.parts[0].text')

    # Keep the result only if jq found real text
    if [ -n "$result" ] && [ "$result" != "null" ]; then
      echo "$result" >> "$OUTPUT"
      break
    fi

    echo "Retry $i..."
    sleep 2
  done

  echo "" >> "$OUTPUT"
done < "$INPUT"

echo "Done. Output saved in $OUTPUT"

Line by Line Explanation

#!/bin/bash

This line is called the shebang. When the script is run directly (for example as ./script.sh), it tells the operating system to execute the file with the Bash interpreter. Without it, the script might run under a different shell and fail on Bash-specific syntax such as {1..3}. Keeping this line makes behavior consistent across systems where Bash is installed at /bin/bash.

INPUT, OUTPUT, MODEL

These are configuration variables. They allow us to change the behavior of the program without touching the internal logic.

INPUT → the file from which data will be read
OUTPUT → the file where generated content will be written
MODEL → the AI model responsible for generating the response

Keeping these values at the top of the script is a best practice. It makes maintenance easy and reduces the chance of accidental errors when adapting the program for new tasks.

> "$OUTPUT"

This command empties the output file before the program begins, or creates it if it does not exist yet.

If this step is skipped, new runs would append results to old ones, creating duplication and confusion. Initializing files ensures reproducibility and clean experiments.

Reading the Input File

while IFS= read -r name || [ -n "$name" ];

This loop reads one line at a time from names.txt. Each line becomes the input for a separate API request.

Let us unpack it:

IFS= prevents trimming of leading or trailing spaces.
read -r avoids interpretation of backslashes.
|| [ -n "$name" ] ensures the final line is processed even if the file does not end with a newline.

This design makes the script robust for real-world text files.
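The effect of the final guard is easiest to see on input that lacks a trailing newline. The sketch below counts how many lines each loop variant processes; the function names and the two sample names are illustrative only.

```shell
#!/bin/bash

# Count lines using the plain loop (no guard).
count_plain() {
  local n=0 line
  while IFS= read -r line; do n=$((n + 1)); done
  echo "$n"
}

# Count lines using the guarded loop from the script.
count_guarded() {
  local n=0 line
  while IFS= read -r line || [ -n "$line" ]; do n=$((n + 1)); done
  echo "$n"
}

# Two names, but no newline after the second one.
printf 'Ada Lovelace\nAlan Turing' | count_plain     # prints 1
printf 'Ada Lovelace\nAlan Turing' | count_guarded   # prints 2
```

The plain loop silently drops the last name because read reports failure at end of file even though it has filled the variable.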

Progress Indicator

echo "Processing: $name"

Automation can run for minutes or hours. Without feedback, users may think the program has frozen. Printing progress messages gives visibility, builds trust, and helps debugging if something stops midway.

Writing Headings

echo "## $name" >> "$OUTPUT"

Before storing the generated paragraph, we insert a heading containing the person’s name. This keeps the output structured and easy to navigate later. If converted into Markdown, PDF, or HTML, these headings can become sections automatically.

The Heart: curl

curl is the component that communicates with the remote AI server. It performs an HTTP request and captures the response.

Important pieces:

-H "Content-Type: application/json" → tells the server we are sending JSON.
The URL → specifies the model and method.
?key= → passes authentication credentials.
-d → contains the body of the request.

Without curl (or an equivalent HTTP client), no connection to the AI is possible.
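The key does not have to travel in the URL. The sketch below sends it in a request header instead, which keeps it out of URLs that may be recorded in logs or shell history. It assumes the x-goog-api-key header described in the Gemini API documentation; verify it against the current documentation before relying on it. The function names gemini_url and call_gemini are illustrative.

```shell
#!/bin/bash

MODEL="gemini-2.5-flash"

# Build the endpoint URL for a model, without embedding the key in it.
gemini_url() {
  echo "https://generativelanguage.googleapis.com/v1/models/${1}:generateContent"
}

# Send a JSON body (first argument) and print the raw JSON response.
# The key is passed in the x-goog-api-key header instead of ?key=.
call_gemini() {
  curl -s \
    -H "Content-Type: application/json" \
    -H "x-goog-api-key: ${GOOGLE_API_KEY}" \
    -d "$1" \
    "$(gemini_url "$MODEL")"
}
```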

Understanding the JSON Body

{
  "contents": [{
    "parts": [{
      "text": "Write one concise academic paragraph..."
    }]
  }]
}

This structure contains your actual prompt. Everything else in the API call is logistics. These lines are the intellectual instruction.

You can modify this text to change tone, size, format, or depth of the response.
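Pasting a variable straight into a JSON string produces an invalid request as soon as the value contains a double quote or a backslash. A safer approach is to let jq build the body, since it escapes such characters. The following is a sketch under that assumption; the function name build_body and the sample name are illustrative.

```shell
#!/bin/bash

# Build the request body for one name.
# jq escapes quotes and backslashes, so the result is always valid JSON.
build_body() {
  jq -n --arg name "$1" '{
    contents: [{
      parts: [{
        text: "Write one concise academic paragraph about the life and contributions of \($name)."
      }]
    }]
  }'
}

# A name with embedded quotes still yields valid JSON.
build_body 'Jane "JD" Doe'
```

The output can be passed to curl with -d "$(build_body "$name")".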

Extracting the Answer

The server sends back a complex JSON object that may include metadata, safety information, and multiple candidates. We usually want only the primary generated text.

jq, a command-line JSON processor, helps us select it.

    .candidates[0].content.parts[0].text

This means:

take the first candidate → open its content → take the first part → extract the text.

Once extracted, we can treat it like ordinary output.
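The path can be tried offline on a small response before it is used in a real job. The sample JSON below is a trimmed, hypothetical response, not real API output. The sketch also appends jq's `// empty` alternative so that a missing path produces no output rather than the word null.

```shell
#!/bin/bash

# Print the first candidate's text, or nothing if the path is missing.
extract_text() {
  jq -r '.candidates[0].content.parts[0].text // empty'
}

# A trimmed, hypothetical response used only for illustration.
sample='{"candidates":[{"content":{"parts":[{"text":"Hello from the model."}]}}]}'

printf '%s\n' "$sample" | extract_text                            # prints: Hello from the model.
printf '%s\n' '{"error":{"message":"bad key"}}' | extract_text    # prints nothing
```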

Retry Logic

Real networks are unreliable. Temporary failures can occur due to rate limits, congestion, or server restarts.

for i in {1..3}

The script attempts the request up to three times. If one attempt succeeds, the loop exits. This small addition dramatically increases the chance of completing long jobs.

Professional systems almost always include retries.
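A common refinement is exponential backoff, where the wait doubles after each failure so that a struggling server gets progressively more room. This is a swapped-in technique, not what the script above does (it waits a fixed two seconds). In this sketch the function name retry and the variable RETRY_BASE_DELAY are illustrative.

```shell
#!/bin/bash

# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs the command until it succeeds or the attempts run out.
# The wait doubles after each failure: base, 2*base, 4*base, ...
retry() {
  local max=$1; shift
  local delay=${RETRY_BASE_DELAY:-2}
  local attempt
  for ((attempt = 1; attempt <= max; attempt++)); do
    "$@" && return 0
    if ((attempt < max)); then
      echo "Attempt $attempt failed, waiting ${delay}s..." >&2
      sleep "$delay"
      delay=$((delay * 2))
    fi
  done
  return 1
}
```

Any command or shell function that returns a non-zero status on failure can be wrapped this way, for example `retry 3 some_request_function "$name"`.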

Validation

if [ -n "$result" ] && [ "$result" != "null" ];

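Even when the server replies, the text might be missing. If the response has no candidates (for example because it contains an error instead), jq -r prints the literal string null, and a failed connection leaves the variable empty. This condition checks for both cases so that we only write meaningful data.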

Without validation, output files could fill with blank sections.

Delay Between Attempts

sleep 2

Pausing between retries prevents overwhelming the server and reduces the likelihood of repeated failures. It also helps remain within rate limits.

Completion Notice

After all lines have been processed, the script prints a final message. This is useful when the program runs unattended, such as in scheduled tasks or remote machines.

Error Handling

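To print the error message from the response, keep the raw JSON in its own variable. In the script above, $result already holds only the text that jq extracted, so the error details are gone by the time the validation runs. Capture the response first, then query it twice:

response=$(curl -s -H "Content-Type: application/json" "$URL" -d "$BODY")
result=$(echo "$response" | jq -r '.candidates[0].content.parts[0].text')
error_message=$(echo "$response" | jq -r '.error.message // empty')
if [ -n "$error_message" ]; then
    echo "Error: $error_message"
fi

Here $URL and $BODY stand for the same endpoint and request body used in the main script. This addition captures any error message returned by the API and prints it to the console, helping with debugging and understanding failures.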
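Not every failure is worth retrying. The sketch below separates transient failures from permanent ones by HTTP status. It assumes the common convention that 429 (rate limit) and 5xx (server errors) are temporary, while other 4xx codes point to a problem with the request or the key. The function name is_retryable is illustrative.

```shell
#!/bin/bash

# Return 0 if the HTTP status suggests the request may succeed on retry.
is_retryable() {
  case "$1" in
    429|5[0-9][0-9]) return 0 ;;
    *)               return 1 ;;
  esac
}

# The status can be captured with curl's -w option, for example:
#   status=$(curl -s -o response.json -w '%{http_code}' ...)
for status in 400 403 429 503; do
  if is_retryable "$status"; then
    echo "$status: retry"
  else
    echo "$status: do not retry"
  fi
done
```

Skipping retries for permanent errors such as an invalid key saves both time and quota.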

Gemini AI Pricing and Cost Management

Understanding pricing is essential before running large scale automation. AI models are powerful, but they operate on expensive infrastructure, and therefore usage is measured and billed carefully. If you know how tokens translate into money, you can design prompts intelligently, estimate budgets, and prevent unpleasant surprises. What follows is a simplified conceptual overview intended to help beginners think like engineers rather than casual users.

Prices evolve over time as providers update hardware, introduce optimizations, or launch new generations of models. Always verify numbers on the official dashboard before committing to production. Still, approximate figures are extremely useful for planning.

Simplified Pricing Snapshot

Model              Input (per 1K tokens)   Output (per 1K tokens)   Free Tier?
Gemini 1.5 Flash   $0.00025                $0.00050                 Yes
Gemini 1.5 Pro     $0.00300                $0.01500                 Yes (limited)

1K = 1,000 tokens.

A token is not exactly a word. In English text, one token is often about three quarters of a word, though this varies. The important lesson is that both what you send and what you receive are counted.

Free usage is typically available in AI experimentation studios such as Google AI Studio or in productivity integrations, but these come with daily or monthly ceilings. Once automation begins, API billing usually applies.

For readers in India, the estimated price of using Gemini 1.5 Flash (known for being fast and efficient) per 10,000 tokens is roughly:

Input (text or image): about 0.0007 USD, which is approximately ₹0.06–₹0.07.
Output (text or image): about 0.003 USD, which is approximately ₹0.25–₹0.26.

How Charging Actually Happens

Every request contains two cost components.

First is the input. This includes your prompt, instructions, context, examples, and sometimes previous conversation. Longer prompts mean higher input cost.

Second is the output. If you ask for long essays, explanations, or detailed reports, the model will generate more tokens and the bill increases.

Therefore, prompt design is financial design.

Real-World Example of Token Consumption

Imagine writing a short email of about 100 words and asking the model to rewrite it professionally.

Possible usage might be:

input → 120 tokens
output → 180 tokens
total → 300 tokens

At the Gemini 1.5 Pro rates listed above, that works out to roughly $0.003 (120 × $0.003 / 1,000 for input plus 180 × $0.015 / 1,000 for output), well under one cent. For individual requests this feels trivial, which is why AI adoption grows so quickly. However, when multiplied by thousands or millions of operations, cost awareness becomes crucial.

Why Small Inefficiencies Become Big

Suppose your prompt accidentally includes unnecessary history or repeated instructions. Maybe you send 500 tokens when only 150 were needed. If you run the job ten thousand times, that is 3.5 million wasted input tokens, all of them billed. Efficient engineers therefore minimize verbosity while preserving clarity.

Optimization at scale is rarely dramatic; it is incremental. Yet incremental improvements accumulate.

Is Gemini Free?

For many casual users, yes. Web interfaces and office integrations often provide limited complimentary access. These platforms are ideal for experimenting with prompts, exploring behavior, and validating ideas.

Developers, researchers, and institutions who require automation, reliability, or large throughput typically move to paid API usage. Payment unlocks control, monitoring, and predictable availability.

The Difference Between Free and API Access

Free environments are designed for humans. They emphasize convenience and interface.

APIs are designed for systems. They emphasize repeatability, measurement, and integration.

If you want nightly batch generation, automatic document processing, or embedding into software products, API usage is the path forward.

Practical Budget Thinking

When planning a project, it helps to estimate.

Ask yourself:

How many requests per day?
How long is each prompt?
How long is each expected answer?

Multiply and you obtain daily tokens. From tokens, cost becomes visible. This allows you to make strategic decisions before deployment.
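The multiplication can be scripted so that different scenarios are quick to compare. This is a sketch using awk for the decimal arithmetic; the function name estimate_daily_cost is illustrative, and the rates passed in are the approximate Flash figures from the pricing table earlier in this article.

```shell
#!/bin/bash

# estimate_daily_cost REQUESTS INPUT_TOKENS OUTPUT_TOKENS INPUT_RATE OUTPUT_RATE
# Token counts are per request. Rates are in USD per 1,000 tokens.
# Prints the estimated daily cost in USD.
estimate_daily_cost() {
  awk -v r="$1" -v in_tok="$2" -v out_tok="$3" -v in_rate="$4" -v out_rate="$5" \
    'BEGIN { printf "%.4f\n", r * (in_tok * in_rate + out_tok * out_rate) / 1000 }'
}

# 1,000 requests a day, 120 input and 180 output tokens each.
estimate_daily_cost 1000 120 180 0.00025 0.00050   # prints 0.1200
```

Changing one argument at a time shows which factor dominates the bill; here the output tokens account for three quarters of the cost.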

Smart Techniques to Reduce Spending

Clear instructions reduce unnecessary elaboration. If the model understands exactly what is required, it will produce tighter responses.

Output limits are powerful. Requesting a paragraph instead of a page can divide costs by five or ten.
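Besides asking for brevity in the prompt, a request can carry a hard cap on response length. The sketch below assumes the generationConfig.maxOutputTokens field of the generateContent request body; the function name build_capped_body, the sample prompt, and the limit of 200 are illustrative values.

```shell
#!/bin/bash

# build_capped_body PROMPT MAX_TOKENS
# Build a request body whose response is capped at MAX_TOKENS output tokens.
build_capped_body() {
  jq -n --arg prompt "$1" --argjson max "$2" '{
    contents: [{ parts: [{ text: $prompt }] }],
    generationConfig: { maxOutputTokens: $max }
  }'
}

build_capped_body "Summarize the life of Marie Curie in one paragraph." 200
```

A cap that is too low cuts the answer off mid-sentence, so it works best as a safety net alongside a clear instruction about length.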

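Reusing context can also help. Each API request is independent, so background material sent with every call is billed every time. Keep shared background short, or use a context caching feature if the provider offers one.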

Testing ideas in free environments before automating them prevents expensive trial and error.

Layered Model Strategy

Many professional systems combine tiers. A cheaper model may prepare, filter, or classify material. Only difficult cases are escalated to premium reasoning engines. This architecture can cut spending dramatically while preserving quality.

Monitoring and Alerts

As soon as real money is involved, visibility becomes mandatory. Dashboards, alerts, and daily reviews ensure anomalies are detected early. Unexpected spikes usually indicate bugs or loops rather than genuine demand.
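Even a very small check catches runaway loops early. The sketch below assumes the automation appends one line per request to a log in the hypothetical form "YYYY-MM-DD TOKENS"; the log format, function names, and limits are all illustrative.

```shell
#!/bin/bash

# tokens_on_day LOGFILE DATE
# Sum the token counts logged for the given date (lines look like "2025-01-31 300").
tokens_on_day() {
  awk -v day="$2" '$1 == day { total += $2 } END { print total + 0 }' "$1"
}

# over_budget LOGFILE DATE LIMIT
# Return 0 and print a warning when the day's total exceeds LIMIT, else return 1.
over_budget() {
  local used
  used=$(tokens_on_day "$1" "$2")
  if [ "$used" -gt "$3" ]; then
    echo "WARNING: $used tokens used on $2, limit is $3" >&2
    return 0
  fi
  return 1
}
```

A batch job could call `over_budget tokens.log "$(date +%F)" 500000 && exit 1` before each request to stop itself once the daily ceiling is crossed.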

The Mindset Shift

Beginners think in prompts. Experts think in pipelines.

Once you adopt the pipeline perspective, pricing becomes another design variable, similar to speed or accuracy. You continuously balance them.