Replies: 2 comments 1 reply
-
Response from ADK Answering Agent (experimental, answer may be inaccurate)

TLDR: You can use streaming to handle large outputs from the Gemini API that approach the maximum token limit. Instead of receiving the entire response at once, you process it in smaller chunks as it is generated.

When you encounter issues with the output size limit of the Gemini API, the recommended approach is to stream the response. This allows you to receive and process the output in smaller parts rather than waiting for one large response. Here is a Python example of how to stream output from a Gemini model:

```python
import google.generativeai as genai

# ... (your existing code for model initialization)

response = model.generate_content("Write a very long story", stream=True)
for chunk in response:
    print(chunk.text)
```

By setting `stream=True`, the response is returned incrementally as it is generated instead of as a single payload. For more details, you can refer to the official documentation on text generation with the Gemini API. [5]
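To get the complete text out of a stream, you still need to accumulate the chunks yourself. A minimal sketch of that accumulation, using a hypothetical `Chunk` dataclass as a stand-in for the objects the real stream yields (each real chunk likewise exposes a `.text` attribute):

```python
from dataclasses import dataclass

# Hypothetical stand-in for the chunks yielded by
# model.generate_content(..., stream=True); the real objects
# also expose a .text attribute with the next piece of text.
@dataclass
class Chunk:
    text: str

def collect_stream(chunks):
    """Accumulate a streamed response into one string."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.text)
    return "".join(parts)

# Usage with a simulated stream:
stream = [Chunk("Once upon "), Chunk("a time, "), Chunk("the end.")]
full_text = collect_stream(stream)
print(full_text)  # Once upon a time, the end.
```

With the real API you would pass the iterator returned by `generate_content(..., stream=True)` straight into `collect_stream`.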
-
Similar to what xuanyang recommends, we use Gemini for entity extraction on very large PDFs, and we have to 1) split the PDF into 10-20 page chunks and 2) run multiple prompts with different …
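Step 1 above (splitting into page chunks) can be sketched as a small helper that computes contiguous page ranges; the `chunk_size` parameter and the 1-based ranges are illustrative assumptions, and each resulting range would then be extracted and sent in its own prompt:

```python
def page_chunks(num_pages, chunk_size=15):
    """Split a document of num_pages into contiguous 1-based
    (start, end) page ranges of at most chunk_size pages each."""
    ranges = []
    start = 1
    while start <= num_pages:
        end = min(start + chunk_size - 1, num_pages)
        ranges.append((start, end))
        start = end + 1
    return ranges

print(page_chunks(48, 15))  # [(1, 15), (16, 30), (31, 45), (46, 48)]
```

A PDF library such as pypdf can then copy each range into its own file before prompting.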
-
I have an agent that returns structured output. The problem is that when the output exceeds 65k tokens (Gemini's maximum output size), the agent run stops and returns invalid JSON.
I wonder how I can tackle this problem so that I can eventually get the complete output (not just the valid JSON, but the complete output).
Thanks.
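One way to approach this is a continuation loop: when the model stops because it hit the output cap, re-prompt it to continue from where it left off and concatenate the pieces until the accumulated text parses as JSON. This is a hedged sketch, not an ADK feature; `ask` is a hypothetical callable wrapping your model call that returns the generated text plus a flag for whether the response was truncated (e.g. derived from the candidate's finish reason):

```python
import json

def generate_complete(ask, prompt, max_rounds=10):
    """Re-prompt until the accumulated output parses as JSON.

    `ask` is a hypothetical wrapper around the model: it takes a
    prompt string and returns (text, truncated_flag).
    """
    text, truncated = ask(prompt)
    parts = [text]
    rounds = 0
    while truncated and rounds < max_rounds:
        # Remind the model of the tail of its last output so it
        # can resume mid-structure without repeating itself.
        more, truncated = ask(
            "Continue exactly where you left off, with no preamble:\n"
            + parts[-1][-500:]
        )
        parts.append(more)
        rounds += 1
    return json.loads("".join(parts))  # raises if still incomplete

# Usage with a fake model that truncates after 10 characters:
payload = '{"items": [1, 2, 3]}'
pieces = [payload[:10], payload[10:]]
def fake_ask(prompt, _state={"i": 0}):
    i = _state["i"]
    _state["i"] += 1
    return pieces[i], i == 0  # only the first piece is truncated
result = generate_complete(fake_ask, "Extract the items as JSON")
print(result)  # {'items': [1, 2, 3]}
```

Whether the continuation resumes cleanly depends on the model; in practice people also shrink the task (as in the chunking reply above) so no single response needs to approach the limit.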