This guide offers an overview of the concepts that an augmented reality (AR) developer candidate should have mastered. Hiring AR developers might seem like a straightforward task, but the constant innovation in this young field can make it a real challenge.
Considering that AR development can work by adding any combination of several forms of virtual elements to a real-world environment, you need to first define the category (or categories) that fit your project’s needs.
Marker-based Augmented Reality
As the name suggests, marker-based AR technology uses markers that serve as key points in a scene to enhance an image of the real world. The markers are usually black-and-white images encoded as QR/2D codes. They are given to the application as input, making them easier to detect when acquiring camera frames.
AR developer tasks in this particular field might include:
- Calibrating the camera
- Selecting a location that makes the task of finding the pattern on the markers easy
- Ensuring that virtual objects shake as little as possible when the camera moves around
Markerless Augmented Reality
In the markerless approach, the strategy is totally different. Without markers, the data used by the application comes from other sources, such as GPS, a digital compass, or an accelerometer. Since most smartphones ship with these sensors, markerless AR technology is a perfect fit for mobile AR apps.
Projection-based Augmented Reality
Projection-based augmented reality consists of projecting an image onto a physical 3D model, in order to make it more realistic. It can be used to detect human interaction with the projected light. For a practical example, this could even be used to facilitate complex manual tasks on an assembly line.
Superimposition-based Augmented Reality
IKEA's product visualization app uses superimposition-based AR.
We can also detect objects in a scene and change/augment them partially or totally. This is the premise of superimposition-based AR. It serves as a great alternative for applications that focus on enhancing the user experience with virtual customization of an object. For instance, the user could try out different virtual furniture positioned in a room.
Having selected the category your project fits into, it is now time to determine the desirable AR developer skills according to your needs. The augmented reality developer is responsible not only for the code; they are also expected to design assets and define or improve the interfaces between your application and its users.
Augmenting a Real-world Environment
When hiring an augmented reality developer, the candidate should be confident in describing the steps behind augmenting a real-world environment. You can select two scenarios, one using markers and another one without them, and let them describe the challenges and solutions:
Explain, within the given scenarios, how you would augment the real world.
For the markers scenario, one challenge would be segmenting the markers from the rest of the scene. Your AR developer candidate should know these steps: prepare the image, get points and regions of interest, and then check whether the detected regions (if any) contain the pattern we are looking for.
To prepare the initial image, we apply a threshold so it becomes a binary image (0 or 1 values for each pixel). After that, we need to detect a feature in the markers to serve as a reference object. One common way of doing this is to use the corners of a square (with a QR/2D pattern) in the marker to serve as reference points.
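The thresholding step can be sketched in a few lines. This is a toy illustration only; the "image" here is a small 2D list of hypothetical grayscale values, not real camera data:

```python
# Minimal sketch of the binarization step: map a grayscale image
# (2D list of 0-255 values) to a binary image of 0s and 1s.
def binarize(image, threshold=128):
    """Each pixel becomes 1 if it is at or above the threshold, else 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

# A tiny hypothetical "frame": a dark marker pixel on a bright background.
frame = [
    [250, 250, 250],
    [250,  10, 250],
    [250, 250, 250],
]
binary = binarize(frame)
# The dark center pixel becomes 0; the bright background becomes 1.
```

A real pipeline would typically use an adaptive threshold rather than a fixed global one, since lighting varies across the scene.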
There are many algorithms to do this matching on the scene. One example is the Harris Corner Detector algorithm. Following a mathematical approach, this algorithm can determine if a region is flat, an edge, or a corner. We would be looking for the corners and edges as they will be used in the next step. Specifically, we want to outline contours that can be fitted by four line segments forming squares/quadrangles.
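The intuition behind the Harris response can be shown in a simplified sketch. Here the response is computed once over a whole patch; a real detector evaluates it per pixel over a sliding, Gaussian-weighted window, but the flat/edge/corner distinction is the same:

```python
# Simplified Harris response for one image patch (2D list of intensities).
# A real detector computes this per pixel over a sliding, weighted window;
# here the "window" is the entire patch, which keeps the idea visible.
def harris_response(patch, k=0.04):
    h, w = len(patch), len(patch[0])
    sxx = syy = sxy = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # horizontal gradient
            iy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0  # vertical gradient
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    # R ~ 0: flat region; R < 0: edge; R > 0: corner.
    return det - k * trace * trace

# Hypothetical 5x5 patches illustrating the three cases.
flat   = [[5] * 5 for _ in range(5)]
edge   = [[0, 0, 1, 1, 1] for _ in range(5)]
corner = [[1 if y >= 2 and x >= 2 else 0 for x in range(5)] for y in range(5)]
```

Running this on the three patches yields a zero response for the flat patch, a negative one for the edge, and a positive one for the corner, which is exactly the classification described above.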
With that information in hand, it’s time to make local descriptors for the extracted regions. They will be compared to the descriptors in the database for the patterns we provided to the application. If we have a positive match, a marker was found, and that 3D coordinate can be used to augment the scene with virtual objects.
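The matching step against the pattern database can be illustrated with toy binary descriptors and a Hamming distance. The descriptors and marker IDs below are made up for the example:

```python
def hamming(d1, d2):
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(b1 != b2 for b1, b2 in zip(d1, d2))

def match_marker(candidate, database, max_distance=2):
    """Return the id of the closest stored pattern, or None if nothing is close enough."""
    best_id, best_dist = None, max_distance + 1
    for marker_id, descriptor in database.items():
        d = hamming(candidate, descriptor)
        if d < best_dist:
            best_id, best_dist = marker_id, d
    return best_id

# Hypothetical 8-bit descriptors for two known markers.
database = {
    "marker_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "marker_b": [0, 1, 0, 0, 1, 1, 0, 1],
}
```

Real systems use longer descriptors (e.g., 256 bits) and approximate nearest-neighbor search, but the idea of comparing extracted descriptors against a known database is the same.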
For the markerless scenario, besides the challenge of extracting the desired features, there is also no information available beforehand (no pattern comparison). One possible way to get these features is to extract image patches that remain the same regardless of changing viewpoint and/or illumination conditions.
We can extract these regions based on different characteristics, for instance, intensity and geometry. A geometry-based patch would exploit corners and edges (the same Harris algorithm could be used). For intensity, we would select regions around local extrema in the image: think of a center point at an intensity extremum, and from that point, cast rays in every direction. An invariant function is used to define the region boundary.
The extracted regions will be used to place information on the screen. There is no comparison with the database this time. This might seem like an easier task, but to define good regions without the help of a distinct marker is a real challenge. Any part of the described steps can be discussed with the candidate to grasp their knowledge of this important subject.
Image Processing

No matter which category you pick, some level of image processing might be required. You can think of this processing as an attempt to modify an image so that it is easier to extract the information we need. A processing technique example can be seen in the image above: in order to get the finer details of the original image, we apply a filter to it (a high-pass filter in this case).
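A high-pass filter can be sketched as a convolution with a kernel whose entries sum to zero, so flat regions vanish and sharp intensity changes stand out. The kernel and toy images below are illustrative, not taken from any particular application:

```python
# A common 3x3 high-pass (Laplacian-like) kernel: its entries sum to zero,
# so uniform regions produce no response, while sharp changes are amplified.
HIGH_PASS = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]

def convolve3x3(img, kernel):
    """Convolve the interior of a 2D list image; borders are left at zero."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out
```

On a perfectly flat image, every interior output pixel is zero; a single bright pixel produces a strong positive response at its location and negative responses around it, which is how fine detail gets emphasized.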
There are several challenges in processing images to be used as input. You can have issues with colors, types of processing, filtering, varying illumination, shadows, etc. It really depends on the camera and the application.
The AR developer must be comfortable analyzing and applying algorithms to digital images. Since this is a very broad subject, the important thing is to try to grasp the breadth of the developer's knowledge.
Briefly describe the challenges related to image processing when developing an augmented reality application. You can use any related fields of knowledge as examples.
Since image processing is a broad field of research, this is more of a general question. The answer should at least cover one major topic of concern. This includes but is not limited to:
- Efficient encoding of images and video sequences
- Image acquisition, enhancement, restoration, and segmentation
- Color processing, classification, and recognition in images
It’s more important to get a sense of the breadth of knowledge the AR developer has than to measure their specific knowledge in this case.
Tracking

Tracking is used to determine the user's viewpoint, or the camera's position and orientation. It represents a real challenge and is still an open research problem.
Describe the approaches to doing object tracking in an augmented reality application.
We can separate the different approaches to tracking into three fields, namely sensor-based, vision-based, and hybrid tracking.
Sensor-based Tracking

As the name suggests, this type of tracking uses sensors such as accelerometers and gyroscopes (these two are often collectively referred to as inertial tracking), GPS, magnetic compasses, etc. They have the advantage of predicting motion well when fast changes occur.
Vision-based Tracking

This approach uses image processing methods to calculate the camera pose. These methods can usually be further subdivided into model-based and feature-based vision tracking.
For model-based tracking, a model of an object of the scene would be the reference for the tracking system. The 3D model needs to be provided beforehand to the application. It can be a previously known 3D model or a model that was reconstructed from extracted scene features. In feature-based tracking, the references would be markers placed in the scene or natural features in the image.
Hybrid Tracking

This last category of tracking methods is used when the others alone can't handle the job well enough. In this case, the AR developer might need to combine different approaches to make the tracking more robust.
Imagine a situation where scene conditions are not optimal: A user abruptly changes the scene by rotating their smartphone while filming an urban scene with augmented information displayed in labels. Using only vision-based tracking, it would be really hard to maintain the robustness and accuracy of the application when displaying information on the scene.
This changes when more information is used, for instance, from the inertial tracking of a 3D gyroscope. A global orientation measured by the gyroscope can be used to position the labels precisely in the new scene, without depending only on the visual differences between frames; those differences would be huge in the scenario we just described.
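One common way to blend the two sources is a complementary filter: trust the gyro for fast, short-term changes and let the absolute (but slower) vision estimate correct its drift. The sketch below is a one-axis toy version with made-up numbers, not production sensor-fusion code:

```python
# Toy one-axis complementary filter. The gyro is trusted for fast,
# short-term motion; the vision estimate slowly corrects accumulated drift.
def complementary_filter(prev_angle, gyro_rate, vision_angle, dt, alpha=0.98):
    gyro_angle = prev_angle + gyro_rate * dt  # integrate angular velocity
    return alpha * gyro_angle + (1 - alpha) * vision_angle

# If the gyro reports no motion but vision reads 5 degrees, the estimate
# is pulled only slightly toward the vision reading...
angle = complementary_filter(0.0, gyro_rate=0.0, vision_angle=5.0, dt=0.1)
# ...while a fast rotation (100 deg/s over 0.1 s) dominates the update.
fast = complementary_filter(0.0, gyro_rate=100.0, vision_angle=0.0, dt=0.1)
```

Production systems typically use a Kalman filter (or quaternion-based fusion) instead, but the complementary filter captures the core idea of hybrid tracking in two lines.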
Specific Tracking Knowledge
Besides asking about the AR developer’s general knowledge in this subject, it’s also important to ask about specific challenges related to tracking. These may include illumination problems, occlusion, clutter, dynamic background, camera motion, the presence of shadows, etc.
Tell me how you would deal with a dynamic background when tracking an object in a scene.
The most common method to differentiate foreground objects from the rest of the scene is called background subtraction. The basic technique consists of subtracting the previous frame from the current frame and thresholding the result on each pixel.
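The basic frame-differencing technique can be sketched directly. The frames here are small hypothetical grayscale grids, and the threshold value is arbitrary:

```python
# Basic background subtraction: per-pixel absolute difference between
# consecutive frames, thresholded into a binary foreground mask.
def background_mask(prev_frame, curr_frame, threshold=30):
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
        for prow, crow in zip(prev_frame, curr_frame)
    ]

# A bright "object" moves one pixel to the right between frames; both its
# old and new positions show up as foreground in the mask.
prev = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
curr = [[10, 10, 10], [10, 10, 200], [10, 10, 10]]
mask = background_mask(prev, curr)
```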
This method may solve the problem right away with a static background, but for a dynamic one, we might need something more robust. Consider, for instance, a camera filming at different times of day, in different weather conditions, or with noise introduced when acquiring the images. These factors need to be considered, and the basic method needs to be improved.
One possible solution is to add an extra step before subtracting the current and previous frames. This step will be responsible for classifying each pixel as background or foreground. An example algorithm is the “Fuzzy C-means clustering” (FCM) algorithm.
The general process is to acquire a video, separate it in frames, convert the frames to grayscale, detect the desired features (edge detection), classify each pixel (extra step), and then subtract the previous frame from the current one. This will output the image without its background.
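The clustering idea behind the classification step can be shown with a toy 1D fuzzy c-means on raw grayscale intensities. Real background classifiers operate on richer per-pixel features and full frames; this sketch, with made-up pixel values, only illustrates how each pixel gets a degree of membership in "dark" vs. "bright" clusters:

```python
# Toy fuzzy c-means (FCM) on 1D grayscale intensities: each pixel gets a
# *degree* of membership in every cluster instead of a hard label.
def update_memberships(values, centers, m=2.0):
    p = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:  # pixel sits exactly on a center: crisp membership
            rows.append([1.0 if d == 0.0 else 0.0 for d in dists])
        else:
            rows.append([1.0 / sum((di / dj) ** p for dj in dists) for di in dists])
    return rows

def update_centers(values, memberships, m=2.0):
    centers = []
    for i in range(len(memberships[0])):
        num = sum((row[i] ** m) * x for row, x in zip(memberships, values))
        den = sum(row[i] ** m for row in memberships)
        centers.append(num / den)
    return centers

def fcm_1d(values, init_centers, m=2.0, iterations=25):
    """Alternate membership and center updates until the centers settle."""
    centers = list(init_centers)
    for _ in range(iterations):
        memberships = update_memberships(values, centers, m)
        centers = update_centers(values, memberships, m)
    return centers

# Hypothetical intensities: three dark (background) and three bright pixels.
pixels = [10, 12, 11, 200, 205, 199]
centers = fcm_1d(pixels, init_centers=[0.0, 255.0])
```

Starting from deliberately bad initial centers (0 and 255), the centers converge near the dark and bright group means; a new pixel can then be classified by its membership degrees against the final centers.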
It’s also important to know if the candidate has any background related to specific application requirements.
An example of this would be, for instance, an application that relies on facial recognition to put extra information on the screen. It would be interesting to evaluate candidates by their breadth of knowledge in algorithms related to face detection and recognition.
Another common example would be a smartphone application that adds virtual objects to a street being filmed. In this scenario, it would be interesting to know if the AR developer has any background in enhancing images and/or recognizing streets using morphological algorithms.
You can ask the developer how they would build a certain application idea. For each initial requirement, try to ask for alternatives and the reasons why. Is the camera good enough? Is the device a problem? Do we need a specific asset? Is there any framework that fits best in this scenario?
Augmented Reality Frameworks
When dealing with an AR app project, we are not going to ask an augmented reality developer to build everything from scratch. There are several frameworks out there that get most of the basics out of the way, letting you build on top of them.
At this point, we have already asked the developer how to do things; now it's time to ask whether they have actually done them and what they used to accomplish the given tasks.
Compare the augmented reality frameworks you know about, including advantages and disadvantages.
If you search the web, there are dozens of AR frameworks. You can browse a thorough comparison of features, but we are not going to pinpoint a single framework as the “best” one right from the get-go.
What's interesting here is to know which framework suits your needs most closely. To find it, we are going to assume a list of features required by an example application and choose the framework from the list that adheres best to them. Here, the developer's experience can also make a difference in the choice.
Our example application is going to target Unity (3D), and we need a framework that is free or open-source to start with. Four well-known candidates pop out from the list in this first query: ARToolKit, Vuforia, Wikitude, and EasyAR.
Describe the applications you have developed in the past. What framework(s) did you use?
These questions are related, and the objective here is to grasp the breadth of knowledge of the AR developer. Simply put, the more the better. We’ll be able to know if the developer has significant experience implementing augmented reality applications and if they are in tune with the newest changes in this field.
Pay Attention to Other Relevant Experience
Besides specific augmented reality knowledge, other types of previous experience might also be very useful and should be considered when hiring an AR developer.
For instance, keep an eye out for developers with good experience in 3D environments, even if completely virtual ones.
Likewise, having people with great video/sound production skills on your project can really improve your users’ experiences.
Game developers should also be included in this consideration. They create, in a sense, new worlds for us to exist in. Developing a full game, or even a game engine, implies knowledge of several topics that interest us, including cameras, textures, lighting, UI/UX, etc.
The bottom line is, it’s important to be able to abstract other developer skills and apply them to your project scenario.
Back to the Real World
This guide aimed to give you the tools and general knowledge to hire a solid augmented reality developer. But a guide is a guide: The process of finding the “correct” candidate will still remain a challenge and require your best discretion. It is important to value many other aspects besides technical knowledge.
If you are looking for new ideas related to AR technology, a good way to search for them is to attend (or read the papers published in) conferences and symposiums related to this subject, like ISMAR and IEEE Virtual Reality.
The questions and answers presented here were made in such a way that it’s easy to abstract them into different problems or specialize them for specific situations. Ultimately, as the interviewer, only you can find the right mix of interview elements for the job you’re trying to fill. You know your AR project—trust your knowledge and your instincts, and you’ll know the ideal candidate when you find them.