I'm trying out the Foundation Models framework, and when I run several sessions in a loop, an error is thrown saying that I'm hitting a rate limit.
Are these rate limits documented? What's the best practice here?
I'm trying to run the models against new content downloaded from a web service, where a single download might contain ~200 items. Each item is relatively small, but all of them need to be processed in a loop.
I don't think we have documented the rate limit as of today, but as far as I know, an app that has UI and runs in the foreground doesn't have a rate limit when using the models; a macOS command line tool, which doesn't have UI, does.
Would you mind sharing how you use the models? In general, if you hit the rate limit and can't work around it by switching to an app with UI, I'd suggest that you file a feedback report with your concrete use case for the Foundation Models folks to evaluate. If you do so, please share your report ID here for folks to track.
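
In the meantime, if the tool has to stay a command line tool, one possible stopgap is to process the items one at a time and back off when the error is thrown. This is only a minimal sketch under assumptions, not a documented recommendation: `retryingOnRateLimit` and its parameters are hypothetical names, the delays are arbitrary, and since the limit isn't documented there is no guarantee that waiting clears it.

```swift
import Foundation

// Hypothetical helper, NOT part of the Foundation Models framework.
// Runs `operation`; if it throws an error that `isRateLimit` recognizes,
// waits and tries again, doubling the wait after each failed attempt.
// Any other error, or a rate-limit error on the last attempt, is rethrown.
func retryingOnRateLimit<T>(
    maxAttempts: Int = 4,                          // arbitrary default
    initialDelayNanoseconds: UInt64 = 500_000_000, // 0.5 s, arbitrary default
    isRateLimit: (Error) -> Bool,
    operation: () async throws -> T
) async throws -> T {
    precondition(maxAttempts >= 1, "maxAttempts must be at least 1")
    var delay = initialDelayNanoseconds
    var attempt = 1
    while true {
        do {
            return try await operation()
        } catch {
            // Give up on non-rate-limit errors or when attempts are exhausted.
            guard isRateLimit(error), attempt < maxAttempts else { throw error }
            try await Task.sleep(nanoseconds: delay)
            delay *= 2
            attempt += 1
        }
    }
}
```

With Foundation Models, `operation` would wrap the `respond(to:)` call on a `LanguageModelSession`, and `isRateLimit` would check for whichever error you are actually seeing. I'm assuming that is `LanguageModelSession.GenerationError.rateLimited`, so please verify it against the thrown error. Awaiting each item before starting the next keeps the loop sequential, so ~200 items never turn into ~200 simultaneous requests.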
Best,
——
Ziqiao Chen
Worldwide Developer Relations.