A Beginner’s Guide to Segmentation in Satellite Images: Walking through machine learning techniques for image segmentation and applying them to satellite imagery

Blog from: https://www.gsitechnology.com/

In my first blog, I walked through the process of acquiring and doing basic change analysis on satellite data. In this post, I’ll be discussing image segmentation techniques for satellite data and using a pre-trained neural network from the SpaceNet 6 challenge to test an implementation out myself.

What is image segmentation?

As opposed to image classification, in which an entire image is classified according to a label, image segmentation involves detecting and classifying individual objects within the image. Additionally, segmentation differs from object detection in that it works at the pixel level to determine the contours of objects within an image.

 

Image for post

Source

In the case of satellite imagery, these objects may be buildings, roads, cars, or trees, for example. Applications of this type of aerial imagery labeling are widespread, from analyzing traffic to monitoring environmental changes taking place due to global warming.

 

Image for post

Source

The SpaceNet project’s SpaceNet 6 challenge, which ran from March through May 2020, was centered on using machine learning techniques to extract building footprints from satellite images—a fairly straightforward problem statement for an image segmentation task. Given this, the challenge provides us with a good starting point from which we can begin to build understanding of what is an inherently advanced process.

I’ll be exploring approaches taken to the SpaceNet 6 challenge later in the post, but first, let’s explore a few of the fundamental building blocks of machine learning techniques for image segmentation to uncover how code can be used to detect objects in this way.

Convolutional Neural Networks (CNNs)

You’re likely familiar with CNNs and their association with computer vision tasks, particularly with image classification. Let’s take a look at how CNNs work for classification before getting into the more complex task of segmentation.

As you may know, CNNs work by sliding (i.e. convolving) rectangular “filters” over an image. Each filter has different weights and thus gets trained to recognize a particular feature of an image. The more filters a network has, and the deeper the network is, the more features it can extract from an image and thus the more complex patterns it can learn for the purpose of informing its final classification decision. However, each filter produces its own feature map, so applying lots of filters at the full resolution of the original input image makes training a model quite computationally expensive. It’s largely for this reason that feature maps typically decrease in size over the course of a network (through pooling or strided convolutions), while the filters increase in number such that more complex features can be learned. Below is an example of what the architecture for an image classification task might look like:

 

Image for post

Source

As we can see, the output of the network is a single prediction for a class label, but what would the output be for a segmentation task, in which an image may contain objects of multiple classes in different locations? Well, in such a case, we want our network to produce a pixel-wise map of classifications like the following:

 

Image for post

An image and its corresponding simplified segmentation map of pixel class labels. Source

To generate this, our network has a one-hot-encoded output channel for each of the possible class labels:

 

Image for post

Source

These maps are then collapsed into one by taking the argmax at each pixel position.
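To make the argmax step concrete, here is a minimal pure-Python sketch. The three classes and their scores are made up for illustration, and a real pipeline would typically do this with a single array operation in NumPy or PyTorch rather than nested loops.

```python
def collapse_class_maps(class_maps):
    """Collapse per-class score maps into one map of class indices.

    class_maps[c][y][x] is the score for class c at pixel (y, x).
    """
    num_classes = len(class_maps)
    height = len(class_maps[0])
    width = len(class_maps[0][0])
    label_map = []
    for y in range(height):
        row = []
        for x in range(width):
            scores = [class_maps[c][y][x] for c in range(num_classes)]
            # argmax: index of the highest-scoring class at this pixel
            row.append(max(range(num_classes), key=lambda c: scores[c]))
        label_map.append(row)
    return label_map


# Hypothetical scores for 3 classes on a tiny 2x3 image
class_maps = [
    [[0.8, 0.1, 0.2], [0.7, 0.2, 0.1]],  # class 0 (e.g. background)
    [[0.1, 0.7, 0.1], [0.2, 0.6, 0.2]],  # class 1
    [[0.1, 0.2, 0.7], [0.1, 0.2, 0.7]],  # class 2
]
label_map = collapse_class_maps(class_maps)
print(label_map)
```

Each pixel ends up with the index of whichever channel scored highest there, which is exactly the single segmentation map we are after.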

The tricky part of achieving this segmentation is that the output has to be aligned with the input image—we can’t follow the exact same downsampling architecture that we use in a classification task to promote computational efficiency because the size and locality of the class areas must be preserved. The network also needs to be sufficiently deep to learn detailed enough representations of each of the classes such that it can distinguish between them. One of the most popular kinds of architecture for meeting these demands is what is known as a Fully Convolutional Network.

Fully Convolutional Networks (FCNs)

FCNs get their name from the fact that they contain no fully-connected layers; that is, they are fully convolutional. This structure was first proposed by Long et al. in a 2014 paper, whose key points I aim to summarize here.

With standard CNNs, such as those used in image classification, the final layers of the network are fully-connected, meaning each of their units is connected to every value of the feature map that feeds them; this means that the size of the input image must be fixed so that the feature map reaching these layers always has the dimensions they expect. Not only does this render the network inflexible to inputs of different sizes, it also means that the network uses global information (i.e. information from the entire image) to make its classification decision, which does not make sense in the context of image segmentation in which our goal is to assign different class labels to different regions of the image. Convolutional layers, on the other hand, are smaller than the input image so that they can slide over it: they operate on local input regions.

In short, FCNs replace the fully-connected layers of standard CNNs with convolutional layers with large receptive fields. The following figure illustrates this process. We see how a standard CNN for classification of a cat-only image can be transformed to output a heatmap for localizing the cat in the context of a larger image:

 

Image for post

Source

Moving through the network, we can see that the layers get smaller and smaller for the sake of learning finer features in a computationally efficient manner, a process known as “downsampling.” Additionally, we notice that the cat heatmap is of coarser resolution than the input image. Given these factors, how does the coarse feature map get translated back to the size of the input image at a high enough resolution such that the pixel classifications are meaningful? Long et al. used what is known as learned upsampling to expand the feature map back to the same size as the input image and a process they refer to as “skip layer fusion” to increase its resolution. Let’s take a closer look at these techniques.

Demystifying Learnable Upsampling

Prior approaches to upsampling relied on hard-coded interpolation methods, but Long et al. proposed a technique that uses transpose convolution to upsample small feature maps in a learnable way. Recall the way that normal convolution works:

 

Image for post

Source

The filter represented by the shaded area slides over the blue input feature map, computing dot products at each position to be recorded in the green output feature map. The weights of the filter are what is being learned by the network during training.
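As a rough illustration of that sliding dot product, here is a small pure-Python sketch with no padding and a stride of 1. The image and filter values are arbitrary; in a real network the filter weights would be learned, and a library such as PyTorch would do the work.

```python
def conv2d(image, kernel):
    """Slide a filter over a 2D image (no padding, stride 1).

    Like deep learning libraries, this does not flip the kernel.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            # Dot product between the filter and the patch it currently covers
            output[y][x] = sum(
                image[y + i][x + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
    return output


image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
feature_map = conv2d(image, kernel)
print(feature_map)
```

Note that the 4×4 input shrinks to a 3×3 output: without padding, convolution can only reduce (or preserve) spatial size, never increase it.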

Transpose convolution works differently: the filter’s weights are all multiplied by the scalar value of the input pixel it is positioned over, and these values get projected to the output feature map. Where filter projections in the output map overlap, their values are added.
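Here is a minimal pure-Python sketch of that projection, assuming a single channel and a fixed all-ones filter so the arithmetic is easy to follow (in an FCN the filter weights would be learned). The output size follows from the loop bounds: (input size − 1) × stride + filter size.

```python
def conv_transpose2d(feature_map, kernel, stride=2):
    """Project each input value, scaled by the filter, onto a larger output map."""
    in_h, in_w = len(feature_map), len(feature_map[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (in_h - 1) * stride + kh
    out_w = (in_w - 1) * stride + kw
    output = [[0.0] * out_w for _ in range(out_h)]
    for y in range(in_h):
        for x in range(in_w):
            for i in range(kh):
                for j in range(kw):
                    # Filter weights are multiplied by the input scalar;
                    # overlapping projections are summed
                    output[y * stride + i][x * stride + j] += (
                        feature_map[y][x] * kernel[i][j]
                    )
    return output


feature_map = [[1, 2], [3, 4]]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
upsampled = conv_transpose2d(feature_map, kernel, stride=2)
for row in upsampled:
    print(row)
```

The 2×2 input grows to 5×5. The middle row and column receive contributions from more than one input pixel, so their values are larger than their neighbors’ even though the filter is uniform.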

 

Image for post

Source

Long et al. use this technique to upsample the feature map rendered by the network’s downsampling layers in order to translate its coarse output back to pixels that align with those of the input image, such that the network’s architecture looks like this:

 

Image for post

An example of a final upsampling layer appended to the downsampling path to render a full-sized segmentation map. Note that the final feature map has 21 channels, representing the number of classes for the particular segmentation challenge being explored in the paper. Source

However, simply adding one of these transpose convolutional layers at the end of the downsampling layers yields spatially imprecise results, as the large stride required to make the output size match the input’s (32 pixels, in this case) limits the scale of detail the upsampling can achieve:

 

Image for post

The upsampled segmentation map (left) is appropriately scaled to the input image but lacks spatial precision. Source

Luckily, this lack of spatial precision can be somewhat mitigated by “fusing” information from layers with different strides, as we’ll now discuss.

Skip Layer Fusion

As previously mentioned, a network must be deep enough to learn detailed features such that it can make faithful classification predictions; however, zeroing in closely on any one part of an image comes at the cost of losing spatial context of the image as a whole, making it harder to localize your classification decision in the process of zooming back out. This is the inherent tension at play in image segmentation tasks, and one that Long et al. work to resolve using skip connections.

In neural networks, a skip connection is a fusion between non-adjacent layers; in this case, skip connections are used to transfer local information by summing feature maps from the downsampling path with feature maps from the upsampling path. Intuitively, this makes sense: with each step we take through the downsampling path of the network, global information gets lost as we zoom into a particular area of the image and the feature maps get coarser, but once we have gone sufficiently deep to make an accurate prediction, we wish to zoom back and localize it, which we can do utilizing information stored in the higher resolution feature maps from the downsampling path of the network. Let’s take a more in depth look at this process by referencing the architecture Long et al. use in their paper:

 

Image for post

Visualization of skip connections (left arrows) in a network and their effect on the granularity of resulting segmentation maps. Source (modified)

Across the top of the image is the network’s downsampling path, which we can see follows a pattern of two or three convolutions followed by a pooling layer. conv7 represents the coarse feature map generated at the end of the downsampling path, akin to the cat heatmap we saw earlier. The “32x upsampled prediction” is the result of the first architecture without any skip connections, accomplishing all of the necessary upsampling with a single transpose convolutional layer of a 32 pixel stride.

Let’s walk through the “FCN-16s” architecture, which involves one skip connection (see the second row of the diagram). Though it is not visualized, a 1×1 convolution layer is added on top of the “pool4” feature map to produce class predictions for all its pixels. But the network does not end there—it proceeds to downsample by a factor of 2 once more to produce the “conv7” class prediction map. Since the conv7 map is of half the dimensionality of the pool4 map, it is upsampled by a factor of 2 and its predictions are added to those of the pool4, producing a combined prediction map. This result is upsampled via a transpose convolution with a stride of 16 to yield the final “FCN-16s” segmentation map, which we can see achieves better spatial resolution than the FCN-32s map. Thus, although the conv7 predictions experience the same amount of upsampling in the end as in the FCN-32s architecture (given that 2x upsampling followed by 16x upsampling = 32x upsampling), factoring the predictions from the pool4 layer improves the result greatly. This is because pool4 reintroduces valuable spatial information from the input image into the equation—information that otherwise gets lost in the additional downsampling operation for producing conv7. Looking at the diagram, we can see that the “FCN-8s” architecture follows a similar process, but this time a skip connection is also added from the “pool3” layer, which we see yields an even higher fidelity segmentation map.
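The FCN-16s fusion can be sketched in a few lines of pure Python. To keep it short, this sketch makes two simplifying assumptions that are not in the paper: it handles a single class score map, and it uses nearest-neighbor repetition as a stand-in for the learned transpose convolutions. The score values are made up.

```python
def upsample_nearest(score_map, factor):
    """Stand-in for learned upsampling: repeat each value factor x factor times."""
    upsampled = []
    for row in score_map:
        wide_row = [value for value in row for _ in range(factor)]
        upsampled.extend(list(wide_row) for _ in range(factor))
    return upsampled


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    return [[va + vb for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]


def fcn16s_fuse(conv7_scores, pool4_scores, final_factor=16):
    conv7_up = upsample_nearest(conv7_scores, 2)  # match pool4 resolution
    fused = add_maps(conv7_up, pool4_scores)      # skip connection: sum predictions
    return upsample_nearest(fused, final_factor)  # back to input resolution


# A 32x32 input gives a 1x1 conv7 map (stride 32) and a 2x2 pool4 map (stride 16)
conv7_scores = [[1.0]]
pool4_scores = [[0.5, -0.5], [0.0, 2.0]]
prediction = fcn16s_fuse(conv7_scores, pool4_scores)
print(len(prediction), len(prediction[0]))
```

Without the skip connection, every pixel of the 32×32 output would carry the same conv7 score. With it, the four quadrants differ, because pool4 contributes finer spatial detail.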

FCNs—Where to go from here?

FCNs were a big step in semantic segmentation for their ability to factor in both deep, semantic information and fine, appearance information to make accurate predictions via an “encoding and decoding” approach. But the original architecture proposed by Long et al. still falls short of ideal. For one, it results in somewhat poor resolution at segmentation boundaries due to loss of information in the downsampling process. Additionally, overlapping outputs of the transpose convolution operation discussed earlier can cause undesirable checkerboard-like patterns in the segmentation map, which we see an example of below:

 

Image for post

Criss-crossing patterns in a segmentation heatmap resulting from overlapping transpose convolution outputs. Source

Many models have built upon the promising baseline FCN architecture, seeking to iron out its shortcomings, “U-net” being a particularly notable iteration.

U-Net—An Optimized FCN

U-net was first proposed in a 2015 paper as an FCN model for use in biomedical image segmentation. As the paper’s abstract states, “The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization,” yielding a u-shaped architecture that looks like this:

 

Image for post

An implementation of the U-net architecture. Numbers at the top of the feature maps denote their number of channels, numbers at the bottom left denote their x-y-size. The white feature maps are copies from the downsampling path, which we can see get concatenated to feature maps in the upsampling path. Source

We can see that the network involves 4 skip connections—after each transpose convolution (or “up-conv”) in the upsampling path, the resulting feature map gets concatenated with one from the downsampling path. Additionally, we see that the feature maps in the upsampling path have a larger number of channels than in the baseline FCN architecture for the purpose of passing more context information to higher resolution layers.

U-net also achieves better resolution at segmentation boundaries by pre-computing a pixel-wise weight map for each training instance. The function used to compute the map places higher weights on pixels along segmentation boundaries. These weights are then factored into the training loss function such that boundary pixels are given higher priority for being classified correctly.
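For reference, the weight map in the U-net paper is defined per pixel as w = w_c + w_0 · exp(−(d1 + d2)² / (2σ²)), where w_c is a class-balancing weight, d1 and d2 are the distances to the borders of the nearest and second-nearest cells, and the paper sets w_0 = 10 and σ ≈ 5 pixels. Below is a small sketch of that formula and of how the weights scale a per-pixel cross-entropy loss. The function names are my own, and computing d1 and d2 from a label image is left out.

```python
import math


def unet_pixel_weight(class_weight, d1, d2, w0=10.0, sigma=5.0):
    """Weight for one pixel: larger when the pixel sits between two nearby objects."""
    return class_weight + w0 * math.exp(-((d1 + d2) ** 2) / (2 * sigma ** 2))


def weighted_cross_entropy(true_class_probs, weights):
    """Sum over pixels of -w * log(p), with p the predicted probability of the true class."""
    return sum(-w * math.log(p) for p, w in zip(true_class_probs, weights))


near_boundary = unet_pixel_weight(1.0, d1=1.0, d2=2.0)   # in a thin gap between cells
far_from_cells = unet_pixel_weight(1.0, d1=20.0, d2=25.0)
print(near_boundary, far_from_cells)

# The same prediction error costs more on the boundary pixel
loss = weighted_cross_entropy([0.6, 0.6], [near_boundary, far_from_cells])
```

A misclassified pixel in a narrow gap between two objects is penalized far more heavily than one in open background, which pushes the network to keep touching objects separate.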

We can see that the original U-net architecture yields quite fine-grained results in its cellular segmentation tasks:

 

Image for post

U-net segmentation results in images b and d, with ground truth boundaries outlined in yellow. Source

The development of U-net was yet another milestone in the field of computer vision, and five years later, models continue to expand upon its u-shaped architecture to achieve better and better results. U-net lends itself well to satellite imagery segmentation, which we will circle back to soon in the context of the SpaceNet 6 challenge.

Further Developments in Image Segmentation

We’ve now walked through an evolution of a few basic image segmentation concepts—of course, only scratching the surface of a topic at the center of a vast, rapidly evolving field of research. Here is a list of a few other interesting image segmentation concepts and applications, with links should you wish to explore them further:

  • Instance segmentation is a hybrid of object detection and image segmentation in which pixels are not only classified according to the class they belong to, but individual objects within these classes are also extracted, which is useful when it comes to counting objects, for example.
  • Techniques for image segmentation extend to video segmentation as well; for example, Google AI uses an “hourglass segmentation network architecture” inspired by U-net for real-time foreground-background separation in YouTube stories.
  • Clothing image segmentation has been used to help retailers match catalogue items with physical items in warehouses for more efficient inventory management.
  • Segmentation can be applied to 3D volumetric imagery as well, which is particularly useful in medical applications; for example, research has been done on using it to monitor the development of brain lesions in stroke patients.
  • Many tools and packages have been developed to make image segmentation accessible to people of various skill levels. For instance, here is an example that uses Python’s PixelLib library to achieve 150-class segmentation with just 5 lines of code.

Now, let’s walk through actually implementing a segmentation network ourselves using satellite images and a pre-trained model from the SpaceNet 6 challenge.

The SpaceNet 6 Challenge

The task outlined by the SpaceNet challenge is to use computer vision to automatically extract building footprints from satellite images in the form of vector polygons (as opposed to pixel maps). In the challenge, predictions generated by a model are determined viable or not by calculating their intersection over union with ground truth footprints. The model’s f1 score over all the test images is calculated according to these determinations, serving as the metric for the competition.
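To get a feel for the metric, here is a simplified pure-Python sketch. It is not SpaceNet’s scoring code: it represents each footprint as a set of pixel coordinates rather than a vector polygon, matches predictions to ground truth greedily, and assumes a prediction counts as viable when its intersection over union (IoU) is at least 0.5.

```python
def iou(footprint_a, footprint_b):
    """Intersection over union of two footprints given as sets of (row, col) pixels."""
    union = footprint_a | footprint_b
    if not union:
        return 0.0
    return len(footprint_a & footprint_b) / len(union)


def f1_score(predictions, ground_truths, iou_threshold=0.5):
    unmatched = list(ground_truths)
    true_positives = 0
    for prediction in predictions:
        # Best ground truth footprint that has not been matched yet
        best = max(unmatched, key=lambda gt: iou(prediction, gt), default=None)
        if best is not None and iou(prediction, best) >= iou_threshold:
            true_positives += 1
            unmatched.remove(best)
    false_positives = len(predictions) - true_positives
    false_negatives = len(unmatched)
    denominator = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denominator if denominator else 0.0


def rectangle(top, left, height, width):
    """Helper for the example: a filled rectangle as a set of pixels."""
    return {(r, c) for r in range(top, top + height) for c in range(left, left + width)}


ground_truths = [rectangle(0, 0, 4, 4), rectangle(10, 10, 4, 4)]
predictions = [rectangle(0, 1, 4, 4), rectangle(20, 20, 2, 2)]
score = f1_score(predictions, ground_truths)
print(score)
```

In this toy example, one prediction overlaps a real building well enough to count (IoU of 0.6), one prediction is spurious, and one building is missed entirely, giving an f1 score of 0.5.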

The training dataset consists of a mix of mostly synthetic aperture radar (SAR) and a few electro-optical (EO) 0.5m resolution satellite images collected by Capella Space over Rotterdam, the Netherlands. The testing dataset contains only SAR images (for further explanation on SAR imagery, take a look at my last blog). The dataset being structured in this way makes the challenge particularly relevant to real-world applications; as SpaceNet explains, it is meant to “mimic real-world scenarios where historical optical data may be available, but concurrent optical collection with SAR is often not possible due to inconsistent orbits of the sensors, or cloud cover that will render the optical data unusable.”

 

Image for post

An example of a SAR image from the SpaceNet 6 dataset, with building footprint annotations shown in red. Source

More information on the dataset, including instructions for downloading it, can be found here. Additionally, SpaceNet released a baseline model, for which they provide explanation and code. Let’s explore the architecture of this model before implementing it to make predictions ourselves.

The Baseline Model

 

Image for post

TernausNet architecture. Source

The architecture SpaceNet uses as its baseline is called TernausNet, a variant of U-Net with a VGG11 encoder. VGG is a family of CNNs, VGG11 being one with 11 layers. TernausNet uses a slightly modified version of VGG11 as its encoder (i.e. downsampling path). The network’s upsampling path mirrors its downsampling path, with 5 skip connections linking the two. TernausNet improves upon U-Net’s performance by initializing the network with weights that were pre-trained on Kaggle’s Carvana dataset. Using a model pre-trained on other data can reduce training time and overfitting—an approach known as transfer learning. In fact, SpaceNet’s baseline takes advantage of transfer learning again by first training on only the optical portion of the training dataset, then using the weights it finds through this process as the initial weights in its final training pass on the SAR data.

Even with these applications of transfer learning, though, training the model on roughly 24,000 images is still a very time intensive process. Luckily, SpaceNet provides the weights for the model at its highest scoring epoch, which allow us to get the model up and running fairly easily.

Making Predictions from the Baseline Model

Step-by-step instructions for deploying the baseline model can be found in this blog. In short, the process involves spinning up an AWS Elastic Compute Cloud (EC2) instance to gain access to GPUs for more timely computation and loading the challenge’s Amazon Machine Image (AMI), which is pre-loaded with the software, baseline model and dataset. Keep in mind that the dataset is very large, so downloads may take some time.

Once your downloads are complete, you can find the PyTorch code defining the baseline model in model.py. baseline.py takes care of image preprocessing and running training and testing operations. The weights of the pre-trained model with the best scoring epoch are found in the weights folder and are loaded when test.sh is run.

When we run an image through the model, it outputs a series of coordinates that define the boundaries of the building footprints we are looking to find as well as a mask on which these footprints are plotted. Let’s walk through the process of visualizing an image and its mask side-by-side to get a sense of how effective the baseline model is at extracting building footprints. Code for producing the following visualizations can be found here.

Getting a coherent visual representation of the SAR data is somewhat trickier than expected. This is because each pixel in a given image is assigned 4 values, corresponding to 4 polarizations of data in the X-band of the electromagnetic spectrum—HH, HV, VH and VV. In short, signals transmitted and received from a SAR sensor come in both horizontal and vertical polarization states, so each channel corresponds to a different combination of the transmitted and received signal types. These 4 channels don’t translate to the 3 RGB channels we expect for rendering a typical image. Here’s what it looks like when we select the channels one-by-one and visualize them in grayscale:

 

Image for post

Visual representations of the 4 polarizations of a single SAR image. Image by author

Notice that each of the 4 polarizations captures a slightly different representation of the same area of land. We can combine these representations to produce a single-channel span image to plot alongside the building footprint mask the model generated, which we convert to binary to make the boundaries more clear. With this, we can see that the baseline model did recognize the general shapes of several buildings:
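As a rough sketch of those two steps, the snippet below combines the four channels into a span image and thresholds a mask. It assumes the common definition of span as total power (the sum of the squared channel values) and a hypothetical threshold of 0.5; the exact processing used for the figures in this post may differ, and the pixel values here are made up.

```python
def span_image(hh, hv, vh, vv):
    """Combine four polarization channels into one channel (sum of squares)."""
    return [
        [a ** 2 + b ** 2 + c ** 2 + d ** 2
         for a, b, c, d in zip(row_hh, row_hv, row_vh, row_vv)]
        for row_hh, row_hv, row_vh, row_vv in zip(hh, hv, vh, vv)
    ]


def binarize(mask, threshold=0.5):
    """Turn a soft mask into a 0/1 mask so footprint boundaries are crisp."""
    return [[1 if value > threshold else 0 for value in row] for row in mask]


# Tiny made-up 1x2 example: one list of rows per polarization
hh = [[1, 2]]
hv = [[0, 1]]
vh = [[0, 1]]
vv = [[2, 2]]
span = span_image(hh, hv, vh, vv)
binary_mask = binarize([[0.2, 0.9]])
print(span, binary_mask)
```

With real data, the same operations would be applied to full-sized arrays (for example with NumPy) before plotting the span image and mask side by side.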

 

Image for post

A visualization of the combined spectral bands of a SAR test image and the corresponding building footprint mask generated by the baseline model. Image by author

It is pretty cool to see the basic structures we’ve discussed in this post in action here, producing viable image segmentation results. But it’s also clear that there is room for improvement upon this baseline architecture; indeed, it only achieves an f1 score of 0.21 on the test set.

Conclusion

The SpaceNet 6 challenge wrapped up in May, with the winning submission achieving an f1 score of 0.42—double that of the baseline model. More details on the outcomes of the challenge can be found here. Notably, all of the top 5 submissions implemented some variant of U-Net, an architecture that we now have a decent understanding of. SpaceNet will be releasing these highest performing models on GitHub in the near future and I look forward to trying them out on time series data to do some exploration with change detection in a future post.

Lastly, I’m very thankful for the thorough and timely assistance I received from Capella Space for writing this—their insight into the intricacies of SAR data as well as recommendations and code for processing it were integral to this post.

LIDAR – Shaping the future of automotive

LIDAR plays a major role in automotive, as vehicles perform tasks with less and less human supervision and intervention. As a leader in VCSEL, ams is helping to shape this revolution.

LIDAR (Light Detection and Ranging) is an optical sensing technology that measures the distance to other objects. It is currently known for many diverse applications in industrial, surveying, and aerospace, but is a true enabler for autonomous driving. As the automotive manufacturers continue their push to design and release high-complexity autonomous systems, we likewise develop the technology that will enable this. That is why ams continues to bring our high-power VCSELs to the automotive market and to test the limits on peak power, shorter pulses, and additional scanning features which enable our customers to improve their LIDAR systems.

In 2019, ams together with ZF and Ibeo announced a hybrid solution called True Solid State where, like flash technology, no moving parts are needed to capture the full scene around the vehicle. By sequentially powering a portion of the laser, a scanning pattern can be generated, combining the advantages of flash and scan systems.

Making sense of the LIDAR landscape

At ams, we classify LIDAR systems on seven elements: ranging principle, wavelength, beam steering principle, emitter technology and layout, and receiver technology and layout. Here we discuss the first five.

The most dominant implementation to measure distance (ranging) is Direct Time of Flight (DTOF): a short (few nanoseconds) laser pulse is emitted, reflected by an object and returned to a receiver. The time difference between sending and receiving can be converted into a distance measurement. Moreover, with duty cycles of <1% this system takes thousands of distance measurements per second. The laser pulse is typically in the 850-940nm range, where components are readily available and most affordable. However, systems can also use 1300 or 1550nm; the big advantage is that eye safety regulations allow more energy to be used at these wavelengths, which in theory provides more range. The downside is that components are expensive.
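The time-to-distance conversion is simple arithmetic: the pulse travels to the object and back, so the distance is the speed of light multiplied by half the round-trip time. A minimal sketch, with function names chosen for illustration:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def dtof_distance_m(round_trip_time_s):
    """Distance to the object: the pulse covers the distance twice (out and back)."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2


def round_trip_time_s(distance_m):
    """Inverse: how long a pulse takes to reach an object and return."""
    return 2 * distance_m / SPEED_OF_LIGHT_M_PER_S


# A pulse that returns after 1 microsecond hit something roughly 150 m away
distance = dtof_distance_m(1e-6)
print(round(distance, 1))
```

The numbers show why timing precision matters: one nanosecond of round-trip time corresponds to about 15 cm of distance.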

To scan the complete surroundings (or field of view) of a vehicle, the system needs to be able to shoot pulses in all directions. This is the beam steering principle. Classical systems used rotating sensor heads and mirrors to scan the field of view section by section. As these systems are bulky, they are being replaced by static systems with internal moving mirrors. MEMS mirrors are also about to enter the market. Another approach is flash, where no moving parts are needed at all. The light source illuminates the complete field of view, and the sensor captures that same field in a single frame like a photo. As the full scene is illuminated, and to remain eye safe, this means the range must be limited.

On the emitter side, edge emitters continue to be frequently used, based on earlier developments. They have a high power density, making them suitable in combination with MEMS mirrors. Where first iterations were single emitters, 2, 4, 8 or 16 emitters are now being integrated in a single bar. Fiber lasers are another interesting technology. They offer even higher power density, are typically used at the 1550nm wavelength, and typically come as a single emitter source.

ams is a leading supplier of VCSEL emitter technology. Our high-power VCSELs can differentiate in scan and flash applications as they are very stable over temperature, are less sensitive to individual emitter failures, and are easy to integrate. However, the best characteristic of VCSELs is their ability to form emitter arrays. This makes VCSELs easy to scale. It also allows for addressability, or powering selective zones of the die. This enables True Solid State topology, which we consider to be the most well-rounded LIDAR solution.

LIDAR enables Autonomous Driving

The most commonly accepted way to classify vehicles on their level of autonomy is by the definitions of the Society of Automotive Engineers (SAE). At SAE Level 3 and above, the vehicle takes over responsibility from the driver and assistance turns into autonomy. This means the vehicle should be able to perform its task without human supervision and intervention. This requires a step function in required system performance. Where Level 1 and Level 2 vehicles assist the driver and typically rely on camera or radar, or a combination, there are shortcomings in these technologies for 3D object detection. LIDAR technology addresses this, and there is wide consensus in the industry that from Level 3 onwards, LIDAR is needed for 3D object detection.

When 3D LIDAR is combined or fused with camera and radar, a high-resolution map of the vehicle’s surroundings can be constructed, allowing the vehicle to safely fulfil its mission. The automotive industry started with more straightforward driver-assist use cases used in Level 1 and Level 2. As sensors and data processing get more advanced, more difficult use cases can be covered, such as Highway Pilot or City Pilot.

Ultimately, when every conceivable use case can be fulfilled by the system we define this as a Level 5 vehicle – fully autonomous and the holy grail of autonomous driving. This is expected to still be quite a number of years out from today. Moreover, there will be huge pressure to bring down cost and rationalize content per vehicle – to make autonomous driving available to the mass market.

Interested to learn more?

Let us know if you would like to discuss how you could be using ams technology to support your potential LIDAR applications!

 

7 Essential Elements Accelerating 5G Rollouts

 

5G is no longer just a promise. It’s very real, even though implementation is in its infancy. Two examples from 2019 demonstrate that 5G implementations are materializing. One is that Verizon launched 5G service in all its NFL football stadiums. The other is that in South Korea, 5G subscribers reached more than 2 million by August of that year, just four months after local carriers commercially launched the technology. In this post, we explore what’s advancing 5G in areas such as small cell densification, spectrum gathering, spectrum sharing and massive MIMO. Although it will take time to become ubiquitous, 5G is expected to be the fastest-growing mobile technology ever. According to the Global mobile Suppliers Association (GSA), 5G is expanding at a much faster pace than 4G LTE, approximately two years faster. GSA recently published data stating that more than 50 operators have launched 5G mobile networks and at least 60 different 5G mobile devices are available across the world.

Ultimately, 5G will have a life-changing impact and transform many industries. However, for 2020, operators are focusing on supporting the first two major 5G use cases: faster mobile connectivity and fixed wireless access (FWA), which brings high-speed wireless connectivity.

The rapid pace of 5G development is highlighted in the 2nd edition of Qorvo’s 5G RF For Dummies book. This NEWLY UPDATED book describes key trends and technology enablers that are bringing 5G visions to life.

Here are some highlights in the book:

  1. Network Densification and Small Cells

5G users will require more cell sites to greatly expand network capacity and support the increase in data traffic. This is prompting mobile network operators (MNOs) to rush and densify their networks using small cells—which are small, low-powered base stations installed on buildings, attached to lamp posts, and in dense city venues. These small cells will help MNOs satisfy the data-hungry users, improving quality-of-service.

  2. Spectrum Gathering

5G requires vast amounts of bandwidth. More bandwidth enables operators to add capacity and increase data rates so users can download big files much faster and get jitter-free streaming in high resolution. The physical layer and higher layer designs are frequency agnostic, but separate radio performance requirements are specified for each of two frequency ranges. The lower frequency range (FR1), also called sub-7 GHz, runs from 410 to 7,125 MHz. The higher frequency range (FR2), also called millimeter Wave (mmWave), runs from 24.25 to 52.6 GHz.
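As a quick illustration of the two ranges, here is a small helper that classifies a carrier frequency using only the boundaries quoted above. The function name and example frequencies are chosen for illustration.

```python
def nr_frequency_range(frequency_mhz):
    """Return "FR1", "FR2", or None for a carrier frequency given in MHz."""
    if 410 <= frequency_mhz <= 7125:
        return "FR1"  # sub-7 GHz
    if 24250 <= frequency_mhz <= 52600:
        return "FR2"  # millimeter wave
    return None  # outside both ranges as quoted above


for mhz in (3500, 28000, 10000):
    print(mhz, nr_frequency_range(mhz))
```

A 3.5 GHz carrier falls in FR1 and a 28 GHz carrier in FR2, while 10 GHz lies in the gap between the two ranges.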

To obtain the bandwidth in FR1 and FR2, more spectrum must be allocated. Already, regulators in roughly 40 countries have allocated new frequencies and enabled re-farming of LTE spectrum. However, much more will be needed. To provide at least some of that, 54 countries plan to allocate more spectrum between now and the end of 2022, according to the GSA.

  3. 4G to 5G Network Progression

The 5G Radio Access Network (RAN) is designed to work with existing 4G LTE networks. 3GPP allowed for multiple New Radio (NR) deployment options, making it easier for MNOs to migrate to 5G by moving from a Non-Standalone (NSA) to a Standalone (SA) deployment, as shown in the figure below.

Transition of 5G Deployment Infographic

  4. Dynamic Spectrum Sharing

Dynamic spectrum sharing (DSS) is a new technology that can further help smooth the migration from 4G to 5G. With DSS, operators can allow 4G and 5G users to share the same spectrum, instead of having to dedicate each slice of spectrum to either 4G or 5G. This means operators can use their networks more efficiently and optimize the user experience by allocating capacity based on users’ needs. Thus, as the number of 5G users increases, the network can dynamically allocate more of the total capacity to 5G.
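The idea of allocating capacity based on demand can be illustrated with a toy model. This Python sketch assumes, purely for illustration, that shared capacity is split in proportion to the number of active 4G and 5G users; real DSS scheduling is far more fine-grained, and the function name `share_spectrum` and its parameters are hypothetical.

```python
def share_spectrum(total_capacity_mbps: float, users_4g: int, users_5g: int) -> dict:
    """Split shared capacity between 4G and 5G in proportion to active users.

    Toy model only: proportional sharing is an assumption made for
    illustration, not how a real DSS scheduler is specified.
    """
    total_users = users_4g + users_5g
    if total_users == 0:
        # No demand, so nothing is allocated to either technology.
        return {"4G": 0.0, "5G": 0.0}
    share_5g = total_capacity_mbps * users_5g / total_users
    return {"4G": total_capacity_mbps - share_5g, "5G": share_5g}


if __name__ == "__main__":
    # Early in the migration: most users are still on 4G.
    print(share_spectrum(1000.0, users_4g=75, users_5g=25))
    # Later: the same spectrum now mostly serves 5G users.
    print(share_spectrum(1000.0, users_4g=25, users_5g=75))
```

With dedicated spectrum, the split would be fixed regardless of the user mix; here the same pool follows demand as users move from 4G to 5G.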

  5. Millimeter Wave (mmWave)

5G networks can deliver the highest data rates by using mmWave FR2 spectrum, where large expanses of bandwidth are available. mmWave is now a reality: 5G networks are using it for FWA and mobile devices and will apply it for other use cases in the future. Operators expect to roll out FWA to more homes, as 5G network deployment expands and suitable home equipment becomes available.

  6. Massive MIMO

MIMO (multiple-input and multiple-output) increases data speeds and network capacity by employing multiple antennas to deliver several data streams using the same bandwidth. Many of today’s LTE base stations already use up to 8 antennas to transmit data, but 5G introduces massive MIMO, which uses 32 or 64 antennas and perhaps even more in the future. Massive MIMO is particularly important for mmWave because the multiple antennas focus the transmit and receive signals to increase data rates and compensate for the propagation losses at high frequencies. This brings huge improvements in throughput and energy efficiency.
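As a rough illustration of why larger arrays help offset propagation losses, the ideal gain of an array of N coherently combined elements is 10·log10(N) dB. This is a textbook idealization rather than a figure from the book, and real arrays fall short of it; the function name below is an illustrative choice.

```python
import math


def ideal_array_gain_db(num_antennas: int) -> float:
    """Ideal array gain in dB for coherently combined antenna elements.

    Assumes perfect combining with no losses, which real hardware
    does not achieve.
    """
    if num_antennas < 1:
        raise ValueError("num_antennas must be at least 1")
    return 10.0 * math.log10(num_antennas)


if __name__ == "__main__":
    # Antenna counts mentioned in the text: 8 (LTE), 32 and 64 (massive MIMO).
    for n in (8, 32, 64):
        print(f"{n} antennas: {ideal_array_gain_db(n):.2f} dB")
```

Under this idealization, each doubling of the antenna count adds about 3 dB, so going from 8 to 64 antennas adds roughly 9 dB.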

  7. RFFE Innovations that Enable 5G

Innovations in RF front-end (RFFE) technologies are needed to truly enable the vision of 5G. As handsets, base stations and other devices become sleeker and smaller, the RFFE will need to pack more performance into less space while becoming more energy-efficient. Some RF technologies are key in achieving these goals for 5G. They include:

  • Gallium Nitride (GaN). GaN is well suited for high-power transistors capable of operating at high temperatures. The potential of GaN power amplifiers (PAs) in 5G is only beginning to be realized. Their high RF power, low DC power consumption, small form factor, and high reliability enable equipment manufacturers to make base stations that are smaller and lighter in weight. By using GaN PAs, operators can achieve the high effective isotropic radiated power (EIRP) output specifications for mmWave transmissions with fewer antenna array elements and lower power consumption. This results in lighter-weight systems that are less expensive to install.
  • BAW Filters. The big increase in the number of bands and carrier aggregation (CA) combinations used for 5G, combined with the need to coexist with many other wireless standards, means that high-performance filters are essential to avoid interference. With their small footprint, excellent performance, and affordability, surface acoustic wave (SAW) and bulk acoustic wave (BAW) filters are the primary types of filters used in 5G mobile devices.

-Blog from https://www.qorvo.com/

Author – David Schnaufer
Technical Marketing Communications Manager

David is the public voice for Qorvo’s applications engineers. He provides technical insight into RF trends as well as tips that help RF engineers solve complex design problems.

New multi-channel spectral sensor from ams, the AS7341, set to transform the market for mobile color and light measurement

Premstaetten, Austria (09 January, 2019) — ams (SIX: AMS), a leading worldwide supplier of high performance sensor solutions, today launched a miniature spectral sensor chip that brings laboratory-grade multi-channel color analysis capability to portable and mobile devices.

In end products such as mobile phones or accessories, the new AS7341 from ams produces more precise spectral measurements in a wider range of lighting conditions than competing sensors. The new sensor’s small dimensions also mean that it is easier to accommodate it in mobile phones and other portable devices.

“The AS7341 marks a breakthrough in the category of spectral sensors in a small package suitable for mounting in a mobile phone or consumer device. It is the smallest such device to offer 11 measurement channels, and also offers higher light sensitivity than any other multi-channel spectral sensor aimed at the consumer market,” says Kevin Jensen, Senior Marketing Manager in the Optical Sensors business line at ams.

Consumer benefits of the AS7341 include improved performance in mobile phone cameras, as the chip’s accurate spectral measurements enable superior automatic white balancing, more reliable light source identification and integrated flicker detection. The technology will more accurately reproduce colors and minimize distortion of ambient light sources, resulting in sharper, clearer and more true-to-color photographs. The AS7341 will also enable consumers to use their mobile devices to match the colors of objects such as fabrics using color references like the PANTONE® Color System.

The power of the AS7341 to upgrade color measurement performance is demonstrated by the introduction of the Spectro 1™ portable colorimeter from Variable (www.variableinc.com). In the Spectro 1, Variable has used the AS7341 to provide professional color measurement for solid colors at a consumer price point. The product provides highly repeatable spectral curve data in 10nm increments across the visible light spectrum from 400nm to 700nm – a capability previously only available in professional spectrophotometers costing more than ten times as much as the portable Spectro 1.

“In our opinion, no other spectral sensor IC comes close to offering the multi-channel capability of the AS7341 from ams in such a compact chip package,” says George Yu, CEO of Variable. “This small size is a crucial benefit – integration with a mobile phone app is one of the key features of Spectro 1, and we have designed the product to be small enough to hold easily in one hand. And the multi-channel spectral measurements provided by the AS7341 mean that users of Spectro 1 will never be misled by false matching of metameric pairs.”

The AS7341 is a complete spectral sensing system housed in a tiny 3.1mm x 2.0mm x 1.0mm LGA package with aperture. It is an 11-channel device which provides extremely accurate and precise characterizations of the spectral content of a directly measured light source, or of a reflective surface. Eight of the channels cover eight equally spaced portions of the visible light spectrum. The device also features a near infrared channel, a clear channel, and a channel dedicated to the detection of typical ambient light flicker at frequencies from 50Hz up to 1kHz.

Besides camera image optimization, the AS7341 spectral sensor also supports various other applications, such as general color measurement of materials or fluids and skin tone measurement.

The AS7341, which will be demonstrated at CES 2019 (Las Vegas, NV, 8-11 January 2019), is available for sampling. Mass production is scheduled to start in February. Unit pricing is $2.00 in an order quantity of 10,000 units.

An evaluation board for the AS7341 spectral sensor is available. For sample requests or for more technical information, please go to ams.com/AS7341.

New CSG14k image sensor from ams provides 12-bit output in 14Mpixel resolution for use in high-throughput manufacturing and optical inspection

Premstaetten, Austria (6 November, 2018) — ams (SIX: AMS), a leading worldwide supplier of high performance sensor solutions, today introduced a new global shutter image sensor for machine vision and Automated Optical Inspection (AOI) equipment which offers better image quality and higher throughput than any previous device that supports the 1” optical format.

The new CSG14k image sensor features a 3840 x 3584 pixel array, giving 14Mpixel resolution at a frame rate considerably higher than any comparable device on the market offers today. The CSG14k’s 12-bit output provides sufficient dynamic range to handle wide variations in lighting conditions and subjects. The sensor’s global shutter with true CDS (Correlated Double Sampling) produces high-quality images of fast-moving objects free of motion artefacts.

The high performance and resolution of the CSG14k are the result of innovations in the design of the sensor’s 3.2µm x 3.2µm pixels. The new pixel design is 66% smaller than the pixel in the previous generation of 10-bit ams image sensors, while offering a 12-bit output and markedly lower noise.
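The headline figures can be checked with a little arithmetic. This Python sketch uses only the array size, pixel pitch, and bit depth quoted above; the function and dictionary key names are illustrative, and the raw frame size assumes tightly packed 12-bit samples with no padding or overhead.

```python
import math

# Figures quoted in the announcement.
WIDTH_PX = 3840
HEIGHT_PX = 3584
PIXEL_PITCH_UM = 3.2
BITS_PER_PIXEL = 12


def sensor_summary() -> dict:
    """Derive pixel count, raw frame size, and active-area diagonal."""
    pixels = WIDTH_PX * HEIGHT_PX
    # Assumes tightly packed 12-bit samples (an assumption, not a spec).
    raw_frame_bytes = pixels * BITS_PER_PIXEL // 8
    width_mm = WIDTH_PX * PIXEL_PITCH_UM / 1000.0
    height_mm = HEIGHT_PX * PIXEL_PITCH_UM / 1000.0
    return {
        "pixels": pixels,
        "raw_frame_bytes": raw_frame_bytes,
        "width_mm": width_mm,
        "height_mm": height_mm,
        "diagonal_mm": math.hypot(width_mm, height_mm),
    }


if __name__ == "__main__":
    summary = sensor_summary()
    print(f"{summary['pixels'] / 1e6:.2f} Mpixel")
    print(f"{summary['raw_frame_bytes'] / 1e6:.2f} MB per raw frame")
    print(f"{summary['diagonal_mm']:.2f} mm diagonal")
```

The pixel count works out to about 13.8 million, which rounds to the quoted 14Mpixel, and the active-area diagonal of roughly 16.8 mm is in line with the 1” optical format, which is commonly taken as about 16 mm.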

The superior image quality and speed of the CSG14k provide important advantages in high-throughput production settings, allowing machine vision equipment to take a more detailed and accurate picture of objects moving along the production line at higher speed. The sensor is suitable for use in applications such as Automated Optical Inspection (AOI), sorting equipment, laser triangulation and other measurement instruments, and robotics.

The CSG14k offers various configuration settings which enable the operation of the sensor to be tuned for specific application requirements. Configuration options include low-power modes at reduced frame rate, and optimizations for low noise and high dynamic range. The device has a sub-LVDS output interface which is compatible with the existing CMV family of image sensors from ams.

The CSG14k is housed in a 218-pin, 22mm x 20mm x 3mm LGA package which is compatible with the 1” lenses widely used in small form factor camera designs.

“Future advances in factory automation technology are going to push today’s machine vision equipment beyond the limits of its capabilities. The breakthrough in image quality and performance offered by the CSG14k gives manufacturers of machine vision systems headroom to support new, higher throughput rates while delivering valuable improvements in image quality and resolution,” said Tom Walschap, Marketing Director in the CMOS Image Sensors business line at ams.

The CSG14k will be available for sampling in the first half of 2019.