After experimenting with some threading, I came to the conclusion that the fastest way to migrate the files was to iterate over the directories, and for each directory, start a console application that migrated the 50,000 files over to the content platform. I could "parallelize" the solution by starting more instances of the console application.
My first thought was to use PowerShell. I experimented a bit, but all of my PowerShell solutions seemed rather clumsy. PowerShell is not my strong suit, so I abandoned the effort and decided to write a console application to start the console application.
It was important for me to be able to configure how many instances of the process to start, and I didn't want to start more than the configured number. My first solution was better than PowerShell, but still rather clumsy. It involved a task list. I would start the console application with a C# Process inside a task that was part of a List<Task>. After all of the tasks were complete (Task.WaitAll()), another batch of processes would begin. The problem is that some of the tasks finish a little faster than others, and I discovered I was wasting time waiting for the whole batch to complete. Here's some code:
while (directories.Count > 0)
{
    // Take the next batch of directories, one per process to start.
    var miniDirs = directories.Take(threadsToStart).ToList();
    foreach (var dir in miniDirs)
    {
        Console.WriteLine(dir);
        tasks.Add(Task.Run(delegate
        {
            ProcessStartInfo pInfo = new ProcessStartInfo();
            pInfo.FileName = "Process.exe";
            pInfo.Arguments = volume + " " + dir.Split('\\')[5];
            Process.Start(pInfo).WaitForExit();
        }));
        directories.Remove(dir);
    }
    // The whole batch must finish before the next one starts.
    Task.WaitAll(tasks.ToArray());
    tasks = new List<Task>();
}
Not elegant. I needed some computer sciency stuff. Semaphore to the rescue (actually SemaphoreSlim https://msdn.microsoft.com/en-us/library/system.threading.semaphoreslim(v=vs.110).aspx). Recalling that a semaphore makes threads wait and then releases them, I decided to move my Process.Start code into a method and start that method in a thread. The method would wait for the process to finish (WaitForExit()) and then release the semaphore, allowing another thread to proceed.
private static void StartProcess(string volume, string dir)
{
    // Block until one of the semaphore's slots is free.
    semaphore.Wait();
    try
    {
        ProcessStartInfo pInfo = new ProcessStartInfo();
        pInfo.FileName = @"Application.exe";
        pInfo.Arguments = volume + " " + dir.Split('\\')[5];
        Process.Start(pInfo).WaitForExit();
    }
    finally
    {
        // Release the slot even if the process fails to start.
        semaphore.Release();
    }
}
I could start each thread in a foreach loop and be sure that no more processes would run at once than the count I passed to the SemaphoreSlim constructor. The remaining threads simply block in Wait() until a slot frees up. Here's the loop:
foreach (var directory in directories)
{
    Thread thread = new Thread(() => StartProcess(volume, directory));
    thread.Start();
}

Much more elegant, and a waiting thread launches its process as soon as any running process completes. I probably could have just asked Ethan Frei and saved myself some time!
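For anyone who wants to try the throttling pattern without a real executable on hand, here is a minimal, self-contained sketch. It is not the migration tool itself: Thread.Sleep stands in for Process.Start(...).WaitForExit(), and the names ThrottleSketch, MaxConcurrent, DoWork, and Run are made up for illustration. It also joins the threads and records the peak number of concurrent workers, which the snippets above do not do.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThrottleSketch
{
    // Hypothetical limit; the real tool would read this from configuration.
    public const int MaxConcurrent = 3;

    private static readonly SemaphoreSlim semaphore = new SemaphoreSlim(MaxConcurrent);
    private static readonly object gate = new object();
    private static int running; // workers currently holding a semaphore slot
    private static int peak;    // highest value of 'running' seen so far

    // Stand-in for StartProcess: the sleep simulates waiting on a child process.
    private static void DoWork(string dir)
    {
        semaphore.Wait();
        try
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }
            Thread.Sleep(50); // assumption: simulated work, not a real process
            lock (gate)
            {
                running--;
            }
        }
        finally
        {
            // Always free the slot so a waiting thread can proceed.
            semaphore.Release();
        }
    }

    // Starts one thread per directory, waits for all of them,
    // and returns the peak number of workers that ran at once.
    public static int Run(IEnumerable<string> directories)
    {
        lock (gate)
        {
            peak = 0;
        }
        var threads = new List<Thread>();
        foreach (var directory in directories)
        {
            var thread = new Thread(() => DoWork(directory));
            thread.Start();
            threads.Add(thread);
        }
        foreach (var thread in threads)
        {
            thread.Join();
        }
        lock (gate)
        {
            return peak;
        }
    }

    public static void Main()
    {
        var dirs = new List<string>();
        for (int i = 0; i < 10; i++)
        {
            dirs.Add("dir" + i);
        }
        Console.WriteLine("Peak concurrency: " + Run(dirs));
    }
}
```

The reported peak should never exceed MaxConcurrent, no matter how many directories are passed in. One thing to keep in mind is that every directory still gets its own thread, and most of them just sit blocked in Wait(). With a very large number of directories, a fixed pool of worker threads pulling from a queue might be a lighter option.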