What is a good way to use goroutines to process a large set of data?
There is a large CSV file:
- The first stage of the pipeline reads the file line by line and pushes each line into a channel.
- The second stage reads from that channel, filters out invalid data, and pushes the valid data into another channel.
- The third stage reads from that channel, calculates some results based on the data, and writes them to an output file.
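If it helps, here is a minimal sketch of such a three-stage pipeline. The file names, the column layout, and the calculation (doubling the second column) are just placeholder assumptions:

```go
package main

import (
	"bufio"
	"encoding/csv"
	"fmt"
	"log"
	"os"
	"strconv"
	"strings"
)

func main() {
	in, err := os.Open("input.csv") // hypothetical input file
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()

	lines := make(chan string)   // stage 1 -> stage 2
	valid := make(chan []string) // stage 2 -> stage 3

	// Stage 1: read the file line by line.
	go func() {
		defer close(lines)
		scanner := bufio.NewScanner(in)
		for scanner.Scan() {
			lines <- scanner.Text()
		}
	}()

	// Stage 2: parse each line and drop invalid records.
	go func() {
		defer close(valid)
		for line := range lines {
			fields, err := csv.NewReader(strings.NewReader(line)).Read()
			if err != nil || len(fields) < 2 {
				continue // invalid row, skip it
			}
			valid <- fields
		}
	}()

	// Stage 3: compute something and write the results out.
	out, err := os.Create("output.txt") // hypothetical output file
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()
	w := bufio.NewWriter(out)
	defer w.Flush()
	for fields := range valid {
		// Placeholder calculation: double the value in the second column.
		v, _ := strconv.ParseFloat(fields[1], 64)
		fmt.Fprintf(w, "%s,%f\n", fields[0], v*2)
	}
}
```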
If I understand correctly, you need to preserve an order. If that order is the order of the lines you are reading in your scan, goroutines will not help you, because the problem has to be resolved synchronously 😔😔
I'm not sure I have completely understood your problem, but if the order is the line order of the CSV file, you can pass each goroutine the line and its index. The goroutine returns the result together with the index, and whoever receives the results puts them into a map (index -> result). This works for both the second and third stages.
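A rough sketch of that index idea, with a placeholder calculation (the line length) standing in for the real work:

```go
package main

import (
	"fmt"
	"sync"
)

type job struct {
	index int    // original line number in the CSV
	line  string // raw line content
}

type result struct {
	index int
	value int // placeholder result type
}

func main() {
	lines := []string{"a", "bb", "ccc"} // stand-in for lines read from the file

	jobs := make(chan job)
	results := make(chan result)

	// Workers process jobs in any order but keep the index attached.
	var wg sync.WaitGroup
	for w := 0; w < 3; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- result{index: j.index, value: len(j.line)} // placeholder work
			}
		}()
	}
	go func() {
		wg.Wait()
		close(results)
	}()

	// Feed the jobs with their indices.
	go func() {
		for i, l := range lines {
			jobs <- job{index: i, line: l}
		}
		close(jobs)
	}()

	// The receiver restores the original order via the index -> result map.
	byIndex := make(map[int]int)
	for r := range results {
		byIndex[r.index] = r.value
	}
	for i := range lines {
		fmt.Println(i, byIndex[i])
	}
}
```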
Yeah that's what I came up with. I added some buffered channels for the pipeline stage communication so I can have a little bit of concurrency.
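For what it's worth, a tiny self-contained example of what the buffering buys: the producing stage can run a bit ahead of the consuming stage instead of handing off strictly one item at a time (the capacity is arbitrary):

```go
package main

import "fmt"

func main() {
	lines := make(chan string, 4) // buffered stage channel, size chosen arbitrarily

	go func() {
		defer close(lines)
		for i := 0; i < 10; i++ {
			lines <- fmt.Sprintf("line %d", i) // doesn't block until the buffer is full
		}
	}()

	for l := range lines {
		fmt.Println(l)
	}
}
```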
But if the whole process is synchronous, there is no point in using concurrency. If it is possible to make everything asynchronous, the conversation is different.
There might be some way to transform the business logic into something more independent, to keep it async. For example, batching the records that share the same ID; that way more of the work could run asynchronously.
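A sketch of that batching idea, assuming a hypothetical record type with an ID field and a placeholder per-batch calculation (summing a value):

```go
package main

import "fmt"

type record struct {
	ID    string
	Value int
}

// groupByID batches records that share the same ID so each batch can be
// processed independently of the others.
func groupByID(records []record) map[string][]record {
	batches := make(map[string][]record)
	for _, r := range records {
		batches[r.ID] = append(batches[r.ID], r)
	}
	return batches
}

func main() {
	records := []record{{"a", 1}, {"b", 2}, {"a", 3}} // hypothetical data
	results := make(chan string)

	batches := groupByID(records)
	for id, batch := range batches {
		go func(id string, batch []record) {
			sum := 0 // placeholder per-batch calculation
			for _, r := range batch {
				sum += r.Value
			}
			results <- fmt.Sprintf("%s: %d", id, sum)
		}(id, batch)
	}

	for range batches {
		fmt.Println(<-results)
	}
}
```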