Float16.cloud Seed Round

Hi everyone, I’m Mati, founder of Float16.cloud. I started building the platform 9 months ago, in October 2023, and launched the first version in January 2024.

After launching the first version of Float16, I received a lot of feedback, and I want to thank you for your support and valuable input. I am also excited to announce that we have received funding from our investor.

Today, I am very proud to announce the first version of our core values and roadmap.

I believe this announcement will help you understand what Float16.cloud is aiming to build and the core values we are dedicated to.

Core Values

1. Developers and Community come first.

2. Fundamentals come first, then applications and solutions.

3. Multiple cost strategies.

We dedicate ourselves to building Float16.cloud as a home for developers and AI researchers. We are committed to providing everything you need to adopt AI in your applications. As an AI researcher who has worked in machine learning engineering and inference optimization for the past 6 years, I am building this platform and community so that you do not have to worry about installation or resource management, and to support your journey from training to production.

Fundamentals have been integral to the platform since day one. We've noticed several platforms and tools that encourage people to focus on "tools" rather than "fundamentals". While this approach is fine for quick-win solutions and applications, it does not serve developers well in the long term.

I am going to teach the fundamentals of how to use each tool and explain all the "magic tricks" behind it. This core value should be more sustainable than focusing solely on solutions and use cases.

Finances are always a problem when developing projects and implementing new features. As a FinOps Practitioner, I am committed to offering multiple cost strategies and an API to manage costs across the platform. This should give developers more flexibility when building new features and let them use the API to calculate a project's cost structure.


- Announce the official channel for news, updates, and support.

- Redesign the website.

- Redesign documentation.

- Implement a new billing system.

- Implement a new account and organization system.

We received a lot of feedback about our communication channels, so I am finally announcing our official channels, listed here:

- News and Updates.

X, LinkedIn, Facebook, Discord, Blog, YouTube (Coming soon), GitHub (Coming soon), Email subscription

- Support.

Email, Discord, Facebook group, Slack, GitHub (Coming soon)

The website and documentation will be revamped so that the documentation and the GitHub examples stay consistent with each other.

The billing system and account system will be rolled out in a new version to support multiple pricing strategies.

GPU Features

- One-click deployment of Hugging Face inference endpoints.

- Large Language Model (LLM) inference benchmark dashboard.

- One-click training for Multi-node GPU.

- Serverless GPU.

- Float16 API and SDK.

GPU features are the key components of the new Float16. GPUs are an essential part of AI products. They come with huge compute power, but also with the complexity of managing them.

We are committed to providing multiple GPU features to serve as many developers as we can.

One-click deployment may seem simple, but we will pack quality into this feature. We aim to reduce pricing by 50% - 70% compared to on-demand rates.

The benchmark dashboard is crucial for calculating the cost of hosting your own LLM. We are going to publish a dashboard in the same style as vantage.sh to make it easier to judge the feasibility of LLM projects.
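To show the kind of calculation a benchmark dashboard makes possible, here is a minimal sketch that turns a GPU's hourly price and a measured throughput into a cost per million generated tokens. The function name and all of the numbers are hypothetical and chosen only for illustration; they are not Float16 prices or benchmark results, and the sketch assumes the GPU is fully utilized.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Return the USD cost of generating one million tokens on a fully utilized GPU."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs: a $2.00/hour GPU serving 1,000 tokens per second.
on_demand = cost_per_million_tokens(gpu_hourly_usd=2.00, tokens_per_second=1000)

# The same GPU with an assumed 50% discount on the hourly rate.
discounted = cost_per_million_tokens(gpu_hourly_usd=2.00 * 0.5, tokens_per_second=1000)

print(f"on-demand:  ${on_demand:.2f} per 1M tokens")
print(f"discounted: ${discounted:.2f} per 1M tokens")
```

Because the hourly price enters the formula linearly, any discount on the GPU rate carries over directly to the cost per token, while a higher benchmarked throughput lowers the cost in inverse proportion.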

Multi-node training is expensive and requires a lot of engineering skill. It is very difficult to achieve the training speeds that providers claim. We are going to build one-click training and ensure the speed matches the claimed benchmark.

Serverless GPU will finally become a reality, after our community of GPU developers has been working toward it for 8 years. Our serverless GPU feature will scale to zero while keeping cold starts under 10 seconds (5 seconds on average). Serverless GPU will help startups and hobby projects access more GPUs and become GPU-rich.

APIs and SDKs are core components for managing project costs. As a FinOps Practitioner, I will give developers full control of their accounts so that they can get the best balance of performance and cost.
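To see why scaling to zero matters for small projects, here is a rough sketch comparing an always-on GPU with a serverless one that is billed only while it is busy. Everything in it is an assumption made for illustration: the function names, the hourly rate, the traffic numbers, the 730-hour month, and the idea that cold-start seconds are billed like any other busy time. It is not a description of Float16's actual billing.

```python
def always_on_monthly_cost(gpu_hourly_usd: float, hours_in_month: float = 730.0) -> float:
    """Cost of keeping one GPU running for the whole month."""
    return gpu_hourly_usd * hours_in_month


def serverless_monthly_cost(
    gpu_hourly_usd: float,
    requests_per_month: int,
    seconds_per_request: float,
    cold_start_fraction: float = 0.0,
    cold_start_seconds: float = 5.0,
) -> float:
    """Cost when the GPU is billed only for busy seconds.

    Assumption: a fraction of requests hit a cold start, and that
    start-up time is billed at the same rate as inference time.
    """
    avg_seconds = seconds_per_request + cold_start_fraction * cold_start_seconds
    busy_seconds = requests_per_month * avg_seconds
    return gpu_hourly_usd * busy_seconds / 3600


# Hypothetical hobby project: 10,000 requests per month, 3 seconds each,
# with 10% of requests paying an average 5-second cold start.
always_on = always_on_monthly_cost(gpu_hourly_usd=2.00)
serverless = serverless_monthly_cost(
    gpu_hourly_usd=2.00,
    requests_per_month=10_000,
    seconds_per_request=3.0,
    cold_start_fraction=0.10,
    cold_start_seconds=5.0,
)

print(f"always-on:  ${always_on:,.2f} per month")
print(f"serverless: ${serverless:,.2f} per month")
```

With low or bursty traffic, the busy time is a small fraction of the month, which is where the savings come from; as traffic approaches constant load, the two costs converge.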


Another contribution I want to make to the community is knowledge sharing. We built a playlist on LlamaIndex as a trial, to gather feedback and measure the response. Several developers came to thank us and left very positive comments.

To align with the community, we are open to community requests for playlists or any fundamental knowledge needed to adopt LLMs as part of a project, product, or service.

The knowledge and playlists will always be free. Come to us with your requests, and we will create playlists for the community.


My team and I are eager to hear from you, and we will keep needing your feedback throughout our journey.

Do not hesitate to share it with our team; we are always open to it.

Thank you for reading, and I hope our core values align with yours.
