If we have a dataset of 2000 GB of video data and a model with 200 layers, what is the best approach to make the training time efficient? Can anyone answer? (They further told me that they have 6 GPUs.)
distributed computing? parallelism?
for multi-GPU training see chainer.org
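To make the two kinds of parallelism hinted at above a bit more concrete, here is a rough sketch in plain Python. It assumes 6 devices and 200 layers as in the question; the helper names (`shard_indices`, `split_layers`, `average_gradients`) are made up for illustration and are not part of Chainer or any other library. It only shows how the work could be divided, not how to run it on real GPUs.

```python
# Hypothetical helpers for illustration only; not a real library API.

def shard_indices(num_samples, num_workers, rank):
    """Data parallelism: worker `rank` reads every num_workers-th sample."""
    return list(range(rank, num_samples, num_workers))


def split_layers(num_layers, num_devices):
    """Model parallelism: assign contiguous layer ranges, one per device."""
    base, extra = divmod(num_layers, num_devices)
    ranges, start = [], 0
    for device in range(num_devices):
        # The first `extra` devices take one additional layer each.
        size = base + (1 if device < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def average_gradients(per_worker_grads):
    """Element-wise mean of the gradients computed by each worker."""
    n = len(per_worker_grads)
    return [sum(vals) / n for vals in zip(*per_worker_grads)]


if __name__ == "__main__":
    # 200 layers over 6 GPUs, as in the question.
    for device, layers in enumerate(split_layers(200, 6)):
        print(f"GPU {device}: layers {layers.start}..{layers.stop - 1}")
    # Each GPU sees a disjoint slice of a (toy) 12-sample dataset.
    print(shard_indices(12, 6, 1))
```

With data parallelism each GPU holds a full copy of the model and trains on its own shard, then the gradients are averaged; with model parallelism each GPU holds only its range of layers. Which one fits depends on whether the 200-layer model fits in a single GPU's memory, which the question does not say.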