Neuro Platform: ML Developers Forum

[01.09.2019] Google AI Platform

This overview describes the state of the platform on September 1, 2019.

Overview procedure description

Product website

Branch in benchmark project

Developer experience - 2

Google provides a lot of CLI tools for controlling cloud computations. I am pretty sure that with good knowledge of Google Cloud it is possible to overcome pretty much all of the problems I faced while evaluating this platform, but for beginners many of these difficulties may be insurmountable.

Documentation is extremely good: a lot of examples, a lot of code snippets and console commands, a lot of options explained, and pretty much everything described in the documentation actually works. But the moment you go beyond the standard actions, searching for any information becomes very problematic. The official Google Groups are just a pile of garbage (the unpredictably long moderation of every message is nonsense), StackOverflow contains only very weak and basic answers, and every Google search is swamped with links to the official documentation.

AI Platform assumes the default Python project layout with a setup.py file (and the dependencies declared inside setup.py). Every job submission copies the whole local project folder to the cloud, installs it there (downloading all dependencies every time) and then runs it. This is pretty convenient because in a lot of respects it looks like local development, but at the same time it creates a lot of artificial constraints and causes a lot of problems.
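To make the submission flow above more concrete, here is a minimal sketch that assembles the argument list for `gcloud ai-platform jobs submit training`. The helper name `build_submit_command` and all example values (job name, bucket, the `trainer.task` module, region, runtime and Python versions) are assumptions for illustration, not something taken from the benchmark project. The sketch only builds the command and does not run it.

```python
import shlex
from typing import List, Optional


def build_submit_command(
    job_name: str,
    package_path: str,
    module_name: str,
    job_dir: str,
    region: str,
    runtime_version: str,
    python_version: str,
    trainer_args: Optional[List[str]] = None,
) -> List[str]:
    """Return the argv for `gcloud ai-platform jobs submit training`."""
    command = [
        "gcloud", "ai-platform", "jobs", "submit", "training", job_name,
        "--package-path", package_path,   # local package that gets uploaded
        "--module-name", module_name,     # entry point inside that package
        "--job-dir", job_dir,             # gs:// location for job outputs
        "--region", region,
        "--runtime-version", runtime_version,
        "--python-version", python_version,
    ]
    if trainer_args:
        # Everything after the bare "--" is passed through to the trainer module.
        command += ["--"] + list(trainer_args)
    return command


if __name__ == "__main__":
    # Hypothetical values; replace them with your own project settings.
    argv = build_submit_command(
        job_name="benchmark_job_001",
        package_path="trainer/",
        module_name="trainer.task",
        job_dir="gs://my-bucket/jobs/benchmark_job_001",
        region="us-central1",
        runtime_version="1.14",
        python_version="3.5",
        trainer_args=["--epochs", "2"],
    )
    print(" ".join(shlex.quote(part) for part in argv))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run(argv, check=True)` on a machine where the Google Cloud SDK is installed and authenticated.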

ML environments extensibility - 1

The ML environment in Google AI Platform is awful. There are several prebuilt environments with different versions of tensorflow + keras + sklearn, but using anything other than that is extremely painful (especially if there is no pip package available).

Data Ingestion - 1

Data is stored in Google Cloud Storage. Data ingestion is unusable. The documented way to upload/download data is via gsutil cp. If you want to upload a large dataset you should use gsutil -m cp -r, which is just the multithreaded equivalent and is EXTREMELY SLOW (and not POSIX-compatible, by the way). I googled for some time and did not find any solution for that.
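For reference, the two gsutil invocations mentioned above can be sketched as a small Python wrapper. The helper names `build_gsutil_copy` and `run_copy` and the example paths are hypothetical. The only gsutil details relied on are that `-m` is a top-level option placed before the `cp` subcommand and that `-r` makes the copy recursive. This is a convenience sketch, not a fix for the slowness described.

```python
import subprocess
from typing import List


def build_gsutil_copy(
    source: str,
    destination: str,
    parallel: bool = True,
    recursive: bool = True,
) -> List[str]:
    """Return the argv for a gsutil copy between local paths and gs:// URLs."""
    command = ["gsutil"]
    if parallel:
        # -m is a top-level gsutil option, so it goes before the "cp" subcommand.
        command.append("-m")
    command.append("cp")
    if recursive:
        command.append("-r")
    command += [source, destination]
    return command


def run_copy(source: str, destination: str) -> None:
    """Run the parallel recursive copy; requires gsutil to be installed."""
    subprocess.run(build_gsutil_copy(source, destination), check=True)
```

For example, `run_copy("data/", "gs://my-bucket/data/")` would execute `gsutil -m cp -r data/ gs://my-bucket/data/`, where the bucket name is a placeholder.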

Various Google frameworks can work transparently with Google Cloud Storage, which is VERY useful and convenient. But it is impossible to properly mount a Google Cloud Storage folder into your AI training job, and as a consequence the full dataset has to be downloaded to the job machine every time, which is a major showstopper for research.

AI starter kits - 3

A large repository with a lot of examples on different topics of Machine Learning (with sklearn) and Deep Learning (with tensorflow and keras).

Collaboration - 3

Google Cloud Storage is a pretty powerful tool for this and I believe you can achieve all the collaboration goals with it.

Bring your own cloud - 1

I don’t think that it is possible.

Enterprise-ready - 3

I think that the whole idea of the Google AI Platform is about Enterprise and not about research.

Final score - 14