This is actually an old post that I wrote a while ago but never published.
I was sitting frustrated, going through 52 open tabs in my browser looking for help with my graduation project. “Maybe you bit off more than you can chew, huh,” I thought to myself. It seemed to me that this project was never going to be finished. But in that moment of despair it came to me: as I was going through the resources (tutorials, lectures, and code notebooks), I realized how fascinated I am by the idea of computer vision. This made things a little less painful, because this is something I really want to be good at, and since I was having trouble, that meant at least I was learning something new. It’s weird that it wasn’t until the final weeks that I realized I had picked something that might actually be my career.
Computer vision, as the name implies, is a computer science discipline that deals with how computers can gain high-level understanding from digital images, videos, or any other visual input. People often think of it as a subfield of AI, but computer vision had been around for years before the explosion of AI, and many methods were developed to solve its problems without AI. It just happens that since the introduction of deep learning and convolutional neural networks (CNNs), these models have been outperforming the other methods. Now, with the availability of data and the computational power to process it, the field is tackling more and more interesting problems.
The idea of computers being able to identify, classify, and segment objects from images or videos, which fundamentally makes them “see” just like us, is both scary and amazing. We are yet to achieve that: Tesla (which is considered a pioneer in the field) is yet to deliver a fully self-driving car, despite claims from Elon Musk. Self-driving cars are cool, but they are not the only application of computer vision. Disease detection (as in my graduation project), optical character recognition (OCR), and disaster early warning systems (this was actually my first idea for my graduation project) are some highlights of the field.
Okay, so now you are probably thinking: “Well, that’s cool, but what’s your plan?” “Are these applications implementable here in Sudan?” “Is this a reasonable career choice?” To be honest, I haven’t made a detailed plan yet, but I’m working on it. Here are the big ideas.
Mentors are so important, and I wish I had known this earlier. They can save you so much time and effort, and you can learn from their experience. About a year ago I was part of a machine learning mentorship program, and I had an exceptional mentor with incredible experience in computer vision, but unfortunately we had to stop the program about two weeks in. I haven’t been in contact with him since, which is a shame really, because I think I could have benefited more from him. However, I’m currently searching for someone with experience in computer vision (tell me if you know someone).
If I want to be great at computer vision, there are a lot of research papers that I should go through. I’ll have to study hard (almost) daily just to keep up. I’m starting with the basics: I’m making a list of the essential papers to read in the field of computer vision, including papers on CNN architectures, transformers, GANs, etc. Computer vision is also closely related to robotics and reinforcement learning, especially in the context of self-driving cars, so I have to keep up with that too. I have also started looking at comma.ai’s openpilot code. For those of you who don’t know, comma.ai is a company that is competing with Tesla. Think of it this way: if Tesla is the Apple of self-driving cars, then comma is the Android, because it is open source. Maybe I’ll make a post comparing the two in detail. At first the code seemed very complicated, but thanks to their own blog I was able to get the gist of it. I’m also trying to create my own master’s degree curriculum. There are some courses (all of which are freely available on YouTube): three from the University of Tübingen and others from NYU.
Alongside the theoretical part, there must also be practical work to cement these ideas. Kaggle is all you can ask for, from beginner to advanced. The amount of knowledge you can get from it is absolutely ridiculous, and with its free GPUs and TPUs you can work with very complex and interesting data without worrying about hardware or software issues. I’m aware of the big difference between doing data science and machine learning on Kaggle and doing it in real life: there are very important steps in the whole process that Kaggle makes invisible to you. While being aware of these differences is important, it’s also important to remember that in real life you work in teams, so you probably won’t find yourself worrying about everything in a real job.
This is one of the most important steps. Talking is easy; show your results. I’m learning how to deploy models and embed them in applications on different platforms. My web development skills will come in handy here. I’m thinking of adding a projects page to this blog where I share my work with you.
Networking is probably the hardest thing on this list. Being an introvert in a world where your network and connections can get you further than your talent or skills ever could is sometimes depressing, but I will have to learn to play by the rules. Maybe the reason I hate working with other people is that I never really did it. I might actually like it, who knows.
I think there’s a big resistance to technology in general, and rightfully so. If you have ever used Bankak, the digital banking app of one of Sudan’s largest banks (I think it has the most clients), you know that it is full of bugs, laggy, and lacking the essence of digital banking, which is spontaneity. And that’s for an arguably very simple and common use case, so you can imagine the reaction to a computer vision application. It is easier to convince someone that an application can calculate their transactions, but tell them it can diagnose their medical images? No way. And that, of course, assumes we have an AI sophisticated enough to really diagnose. It’s not all Bankak’s fault though: the overall telecommunications infrastructure in this country is questionable, not to mention things like cybersecurity, etc. But still, it’s not completely hopeless yet (let’s hope I’m not wrong). How can we regain Sudanese people’s trust in technology? That’s an idea for another post, for another time. Also, people always seem to underestimate the influence AI in general can have on Sudanese people’s lives, because they have very rigid and specific ideas of how it can be used. It doesn’t have to be complex to be useful. Technology is measured by the value it creates in people’s lives, not by how sophisticated it is.
What about you? What do you think about computer vision, and do you have an idea for an application that uses it? What do you want your computer to see?