Our next speaker is Olivier Etienne. He works
at Orange, and we have had quite a few talks here talking about testing on mobile. Olivier
is going to focus on something completely different: how do you test on a set-top box?
So take it away.>>Olivier Etienne: Hi, everybody.
So I’m Olivier Etienne. I work at Orange. Orange is a telecommunication operator that
delivers mobile solutions, Internet access, and also TV.
And today, I want to share with you the experience we had in this context. So it’s a kind of
exotic context, because we saw a lot of test automation on the mobile phone, on the server
side, on classic context. You will see mine is a bit different.
So I work in a team that’s aiming to help other development teams. That is our function.
So we work on any kind of transversal context problems. And test automation is one of them.
So first, a bit of vocabulary. Anyone knows what is an STB?
Okay. Three people. STB stands for set-top box. Set-top box, basically,
is a device that provides TV over the Internet. And like a lot of devices that have very few
resources, we have to reboot it, because if we don’t do that, you know the end of my sentence: it will crash and we will have problems. But you don’t have to worry about doing that. We
will do that for you. So when the STB goes into deep standby because the user didn’t
use it for three hours, we have systems that automatically reboot it, release all the memory
and put it in the low consumption mode. Okay. And also the aim of this device, of
this kind of plastic box that you plug into your TV, is to deliver services. Services
can be VOD, electronic program guide, some kind of tools to watch the TV, and any kind
of tool that you can find on your TV. For a developer, STB, so the set-top box,
is a kind of system like any other kind. There is a hardware layer that manages all the hardware,
the CPU, the tuner, the video card, and so on.
But there is a software layer with drivers, with the operating system. In our case, it’s
a very tiny Linux; it’s BusyBox. And on top of that, there is a browser, a classic
Web browser. And last layer is the business layer, where
we are developing our TV application. This business layer can be in different technology,
in native C++ with Qt, it can be Web browser based, or it can be Android.
In our case, I’m going to talk about what we’ve done with Web browser-based.
Okay. And sometimes it can be a mix of different technology. And it’s a big problem.
Okay. Next, what is a STB for Orange? Orange has so many customers. Its customers
are in different countries. So for Orange, STB is a device that is in the customer’s
living room. And there are several million devices spread across different continents.
There are so many different kinds of STB, so many kinds of device. And also many different
countries. Our solution is deployed in France, Poland, Spain, and Senegal.
But with that, we just have one code to manage all these different configurations, hardware
configuration, country configuration. I think we can call that fragmentation, just like
in the Android context. Yeah. As you can see, there are so many devices.
This is a sample of the different kind of STB we have to deal with. So there is — there
is some kind of low-cost, other premium boxes, or it can be a set-top box that manages satellite
connection, or DVB-T, any kind of context. So now the question is: how do you test your
TV applications in this fragmented context? So let’s get back one year and a half ago
in the past. So we have our TV and the STB on one side. And you know what? To do that,
we do just like you. We take a remote control — okay, we don’t have beer, we don’t have
a sofa and Cheetos. We take the specification and the test suite and all the description
of what the product is supposed to do, and we take the remote control and we play on
the different keys in order to make the STB change the channel, enter the menu, et cetera.
And when all of this is finished, we have to fill a report to say, okay, this version
is certified. We can deliver this release to our QA team that is dedicated to end-to-end tests.
If we take a look at the development teams,
we are working in agile mode. So we have sprints. Our sprints are generally four weeks. We
have three weeks of development and one week of test. During this week of test one year
ago, you put your keyboard and your mouse away. You take your remote control, and you
just do that. You can imagine what — how the developer liked that.
And because of our context, of the fragmentation, this is a really boring activity, and this
is also something that is not very efficient. Because we have to spend a lot of time, 25% of our activity, and during this 25% we have to compile our application in different contexts, deploy it, set up the box, reset everything, and start our tests
for maybe three different countries. And we have to run the test on the things that we
have done, on the new functionality, but we have also to test all the other functionality
in order to be sure that we don’t introduce some regressions. So it’s quite impossible
to test everything, even when we identify the things that may have changed. One week
is not enough. And most of all, we are not confident in what has been tested, because
— because one developer can test in one way, one other one is a specialist of VOD or another
context, and he can do more accurate tests. And this is not the same person who tests
the application from sprint to sprint. Because of that, all the non-regression campaigns
are very expensive, because we can’t test everything, so we have to focus really on
what has changed. And it’s difficult to identify. And this leads to a kind of fear to make any
change. And as you can imagine, in an application that is five years old, so kind of legacy
code, if you’re not able to do refactoring because you’re not able to test, this is a
kind of problem, and this affects your productivity and the quality of your product.
So everybody was happy. And most of all, everybody agrees that we
have to do something to solve this problem, and we need to automate all these activities.
The automation objective was first to reduce the test campaign because in one week, one
week is not enough. We need maybe two weeks and maybe more. We have to do non-regression
tests, functional tests, and most of all, we want to be confident in what we do. We
know we have big refactoring to do in the code, and without anything to protect us against
regression, we can’t do other things. And cherry on the cake, we have to use the
target environment. Because on STB, in our context, we didn’t have any kind of emulator.
So — in fact, we have a very tiny emulator. But we can’t use it to run tests. So we have
to work directly on the target environment on the STB, so on this white box with the
card inside that runs more or less a Linux system, but with very few commands.
And if it can be low cost, it will be better, because we can distribute the solution to
any developer and spread it among the different teams, QA teams and so on, et cetera.
And if it’s easy to use, it will be better, because we can involve all the people and
developers in these activities.
To do that, so to replace developers during the test phase, we tried to hire the best. So we tried to hire robots. But they are robots
from the ’90s, so they’re a bit rusty. And I don’t know if you remember how these
kind of robots see our world. It’s something like that. Something in red and white or green
and black, low resolution. They have to do some kind of analysis to find what is exactly
in this world that they see with a camera. And so we didn’t recruit them. But you will
see that we’ve done quite the same. We didn’t recruit them, but we still want
to have robots. [ Laughter ]
This is a real image. There is a tool that worked like that, just to test how the STB
works with all the gyroscope. And this is based on LEGO. But this worked a lot.
So let’s get back to our robots from the ’90s. In fact, we have tried three kind of tools.
It’s Witbe, TAKT engine and STB-tester. All those tools work quite the same. There is
an infrared emitter that simulates the remote control. And there is some electronic device
to do video capture and video analysis. So you plug it into your TV, and it takes screenshots. And after that, with the screenshots, you have a lot of work. And that’s the main problem
of this solution. It’s expensive for some of the solution. I think the Witbe solution
is 10,000 euros. So if you wanted to give one to each developer, it’s a bit expensive.
And the other problem is that creating and using the tests is also very expensive. Because the first phase is to teach your tool what to test. So you have to do
screen capture of any of your UIs. And in the screen captures, you have to identify
the things you want to test. And you have to say, okay, this is a button. Okay, this
is a text field. If it’s a text field, you have to do text recognition in order to capture
the text inside. So very difficult. But it could be okay if there wasn’t another
problem, that was UI that changes. We are developing TV application, and the UI may
change in style, the colors, the position of widgets. And all of these three solutions
are very sensitive to changes. So it doesn’t work.
And if we get back to our ’90s robots, we have systems that work quite the same. They
do screenshots, and they try to find in the screenshot, in the image, what are the elements
that they have to monitor.
And the other thing that is a bit weird: from the moment we deploy that on the TV, we can just have screenshots. Other developers don’t have to take a screenshot and a JPEG image in order to automate their tests.
So this wasn’t a good solution. We have a lot of problems with that. And we’ve tried
to take a different approach. So in our context, we have two worlds. One
world is the infrared world, with the remote control on one side and our set-top box and the TV on the other side. On this system, on the set-top box, we don’t have a lot of possibility to change things or to add more tools, because this is a very tiny Linux with not a lot of functions.
I think there is cd, ls and maybe other commands, but not a lot.
And in the other world, we have the tools that all the developers know, Selenium, Jasmine,
Mocha. Any kind of tools that are used to test Web pages.
And between those two worlds, nothing. But I, developer, I hate to do repetitive
tasks. I believe that machine should work for me. And I’m lazy.
So I had to find a solution.
And two years ago a new system arrived, a new small card. I don’t know, maybe a lot of you know this card. It’s called raspberry
pi. Let’s see how we use this raspberry pi. So
we have our first world with the infrared, and we decide to put a raspberry pi just in
the middle. On this raspberry pi, we had a card with an infrared receiver and an infrared
blaster to allow this raspberry pi to talk in the infrared world.
So our raspberry pi now knows how to read infrared and send infrared.
But raspberry pi has also an ethernet card. So raspberry pi can also understand HTTP request.
And now we have our system that knows how to talk with the two worlds. So our raspberry
pi will be used as the mediator or the glue between the two worlds.
And we started to add new web services on this raspberry pi, such as “send a key”
to simulate a key on the remote control, or send a sequence of keys in order to simulate
user interaction that go on one channel, zap, zap, on the next one, and so on.
And okay, we also add the responsibility to receive screen dumps. So it’s able to receive a screen description coming from the STB.
So let’s see what it does. We start from our little computer on the right of the screen. So this is the developer world.
We start by sending, okay, an HTTP request with an order to send a key. This HTTP request is received by the raspberry pi. It knows how to send an infrared signal, so it sends one. This signal is sent all the way out to the STB, which, say, changes the channel. And we also added a small hack, in fact, on the business layer, on the application layer on the TV, because we are the developers that work on that, so we have the ability to do this kind of thing. It retrieves the content of the main window and is able to send it back to the raspberry pi. So the raspberry pi receives it and sends it back to the caller. So now we have a synchronous call that does HTTP on one side, manages all that has to be managed for infrared, and sends back a response to the HTTP request with the content of the screen.
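To make that round trip concrete, here is a minimal Python sketch of the idea: an HTTP request goes in, an infrared key would go out, and a screen description comes back synchronously. The endpoint name (`/sendKey/...`), the JSON dump format, and the stub server standing in for the raspberry pi are all my assumptions for illustration, not Orange’s actual API.

```python
# Sketch of the synchronous HTTP <-> infrared round trip.
# A stub server stands in for the Raspberry Pi; endpoint and
# dump format are invented for illustration.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class FakePiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /sendKey/CHANNEL_UP
        key = self.path.rsplit("/", 1)[-1]
        # (a real Pi would emit the IR signal here and wait for the
        #  STB to push back the content of its main window)
        dump = {"key": key, "screen": "<div id='channel'>Channel 2</div>"}
        body = json.dumps(dump).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

def send_key(base_url, key):
    """Synchronous call: HTTP in, infrared out, screen dump back."""
    with urllib.request.urlopen(f"{base_url}/sendKey/{key}") as resp:
        return json.loads(resp.read())

server = HTTPServer(("127.0.0.1", 0), FakePiHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
result = send_key(f"http://127.0.0.1:{server.server_port}", "CHANNEL_UP")
print(result["key"])                   # CHANNEL_UP
server.shutdown()
```

The important design point is that the call is synchronous: the caller blocks until the screen description comes back, which is what lets a test framework treat the TV like an ordinary web page.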
Yoohoo! So take a look, a first look at the raspberry
pi and how we customize it. So it is this kind of card, so this size, and on this card
you have a Linux system, and on this Linux system we’ve decided to add some other components,
open source components. So we added Apache, PHP and a Java engine.
On the raspberry pi we developed a shield where we add an infrared receiver, an infrared
blaster, and we also use some other libs like LIRC or pi4j in order to manage the input/output
with the electronic on the card. So now when we send an infrared signal, we
are able to catch it or to spy it with our raspberry pi. So we are able to record a key
or record a key sequence. And we are also able to emit infrared signals. Okay. Simple
like that. So let’s see what it does in real life.
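Behind the spy and replay services, the mechanism is roughly this: record the keys (and the delays between them) seen by the infrared receiver, then re-emit them through the blaster. The class and method names below are my own sketch, not the real tool’s API.

```python
# Minimal sketch of "spy mode": record a key scenario with its
# timing, then replay it through an IR blaster function.
import time

class IRScenario:
    def __init__(self):
        self._events = []   # list of (key, delay_since_previous_key)
        self._last = None

    def record(self, key, timestamp):
        delay = 0.0 if self._last is None else timestamp - self._last
        self._events.append((key, delay))
        self._last = timestamp

    def replay(self, send_ir, sleep=time.sleep):
        """Re-emit the recorded sequence with the original timing."""
        for key, delay in self._events:
            sleep(delay)
            send_ir(key)

# Spy a quick zapping session...
scenario = IRScenario()
scenario.record("CHANNEL_UP", 0.0)
scenario.record("CHANNEL_UP", 1.5)
scenario.record("OK", 2.0)

# ...then replay it (into a list here, instead of a real IR blaster).
sent = []
scenario.replay(sent.append, sleep=lambda _: None)
print(sent)  # ['CHANNEL_UP', 'CHANNEL_UP', 'OK']
```

Storing delays rather than absolute timestamps is what makes a recorded scenario replayable at any later time.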
We have our raspberry pi. We send it an order through an HTTP interface. We ask it to start a key sequence. This key sequence is started and is received on the TV.
It sends back a dump, and this dump is received by the caller. In fact, the caller, the emitter
of the HTTP request, was a simple Web browser. And now we have on one side what is on the
TV, and on the other side, we’ll try exactly the same content but in our browser on the
Web developer computer. So that’s a good news, because now we don’t
have just the PC of the developer on the one side and the TV on the other side, and in
some sense, my TV has become just a Web site. Okay. The TV, the PC.
We can start the first video. Okay. So on this video, on the right side
there is my computer with an SSH connection on the raspberry pi, and on the left side
we have — we have our raspberry pi and the STB.
So here it is in spy mode. This means that I switch channels and you can see here the raspberry pi spies on all the keys that have been pressed, and it stores that scenario.
This means that now, if I recall the same scenario from Web browser, from my Mozilla,
this scenario is played on the TV. This is a bit dark but on the left, this is my TV.
So I can command my TV from my Web browser, and at the end of the scenario, I receive
— oops. Okay. Can we just show back the — No, we can’t. Okay. No problem.
That was the video. Okay. So we move from specific and exotic context
with the STB, which is a kind of embedded device that has not a lot of functionality,
to something more standard like a Web site. And that’s great news for a developer because
when we go back to a standard world, we retrieve all the tools of this world.
And if we take a look at what we’ve received in our browser, so this is my browser with
a firebug plug-in. In fact, I don’t have a screenshot, I don’t have an image. I have
the exact DOM content. So this is no longer a screenshot. This is the real page. I got exactly the same thing, the same structure as the one that was used in my STB.
So I can check the CSS, I can check any DOM element, its property, its content, et cetera.
So I have the real DOM. And also good news. I will be able to use
standard tools to write automatic tests and create a test suite. So we decided to use Selenium IDE. Selenium IDE manages tests and test suites. Each suite is composed of test cases, classic.
And in these test cases, we have two kinds of activity. We can send orders to the raspberry pi, send a key or a key sequence, or do any kind of stuff. We’re able to work with the box, to send a message to the Linux console. Many, many things.
And we have also the ability to do some assertions. So to verify, to check that what is displayed
on the screen is what was expected. I think we have another video. Yeah.
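Because the dump is the real DOM rather than pixels, an assertion can target structure and content directly, in the spirit of a Selenium IDE `assertText` on an XPath target. Here is a sketch in Python; the dump content and the `id` names are invented for illustration.

```python
# Asserting on a DOM dump instead of pixels: locate an element by id
# and check its text, much like assertText with an XPath in Selenium IDE.
import xml.etree.ElementTree as ET

dump = """
<window>
  <div id="banner">
    <span id="channelName">France 2</span>
    <span id="programTitle">Le Journal</span>
  </div>
</window>
"""

root = ET.fromstring(dump)

def assert_text(root, element_id, expected):
    # XPath-style lookup by id, as we would write it in Selenium IDE
    node = root.find(f".//*[@id='{element_id}']")
    assert node is not None, f"no element with id={element_id}"
    assert node.text == expected, f"{node.text!r} != {expected!r}"

assert_text(root, "channelName", "France 2")
assert_text(root, "programTitle", "Le Journal")
print("all green")
```

Note that this kind of check is insensitive to colors, styles, or widget positions, which is exactly what made the screenshot-matching tools so fragile.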
So now let’s replay the same thing from my browser. I launch my test suite that is running
on my TV here (on the left). So the raspberry pi plays exactly the same sequence that it has stored, and after a few seconds it will send back a screenshot to Selenium,
and you will see it’s very fast. Okay. Everything is green.
Whew. So now we make a big jump into the 21st century.
We started from an infrared system. We added a raspberry pi in order to use it through
HTTP request. Now we’ve plugged Selenium on it. And in fact, we used not Selenium directly
but Selenium IDE. Why Selenium IDE? Because it is quite as good as Selenium but you don’t
have to write code. It’s much simpler. It has a kind of script language. And it’s running
in a Firefox plug-in, so you don’t have anything to install on your PC in order to run this
kind of test. In fact, nobody has to install Eclipse or their favorite IDE in order to write and run tests. In fact, it was one of our secret objectives.
We didn’t want to be alone to write tests. We want to open the test world to non developers.
So with Selenium IDE, you just have to know xpath and some kind of assertions, a bit of
scripting and it’s okay. Now if we see the big picture of where we
are, we still have our set-top box, our raspberry pi, but now we have Jenkins. So next step.
Jenkins, every night, will check the test repository, because all of our test suites and test cases are stored in our repository. It updates the Selenium context. It also retrieves all the code of our TV applications. It deploys this code on the STBs, reboots them, resets the context, as for any embedded device, as we’ve seen this morning. And it runs the test suites. It recovers all the results. It creates for us all the reports, and it is also able to push all of this into Quality Center. Quality Center is our repository for all of our test suites.
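Put together, the nightly flow reads roughly like the orchestration sketch below. Step names, suite names, and the fake results are illustrative; in reality each step is a Jenkins build step against real devices.

```python
# Sketch of the nightly flow: for each STB, update code and tests,
# deploy, reset, run the suites, then aggregate a report.

def run_nightly(stbs, run_suite):
    report = {}
    for stb in stbs:
        # update test repo + TV application, deploy, reboot, reset
        # context (elided here), then run the suites on the target:
        report[stb] = run_suite(stb)
    return report

def summarize(report):
    total = sum(len(r) for r in report.values())
    failed = sum(1 for r in report.values() for ok in r.values() if not ok)
    return {"total": total, "failed": failed, "green": failed == 0}

# Fake results for six boxes, as in the talk's nightly runs:
fake = lambda stb: {"zapping": True, "vod": True, "epg": stb != "stb-3"}
report = run_nightly([f"stb-{i}" for i in range(6)], fake)
print(summarize(report))  # {'total': 18, 'failed': 1, 'green': False}
```

The aggregated summary is what feeds a dashboard: with six boxes tested every night, nobody reads raw results one by one.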
So we can automatically create our delivery status. And another thing we added is a dashboard in order to follow the status of all the test suites, because now we have six different STBs that we test every night. So this generates a lot of data, and we need to have a quick view on the status of these tests. And we are also able to generate this on demand.
So, about the results. When we started this experimentation,
we just thought that we would do something that simulated a remote control. And we’ve gone far beyond that. Because, as you’ve seen, the first step was just to provide an HTTP interface.
But we added many, many, many functionalities in order to get something that can be run
inside Jenkins. We have added so many functions, so now we are able to reboot the STB by sending a reboot command in the Linux console. But we are also able to do an electrical reboot; we have added some components on the card of the Raspberry Pi to control that.
We have — last week, we had more than functional
tests running every night. In fact, we have a lot more tests. But we have some kind of filter in order to select only the test cases that have not been flagged as flaky.
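Such a filter can be very small; here is a sketch where each test case carries tags and the nightly run keeps only the unflagged ones. The tagging scheme and test names are assumptions, not the real repository’s format.

```python
# Selecting only test cases that are not flagged as flaky for the
# nightly run. Test names and the tagging scheme are illustrative.

tests = [
    {"name": "zapping_basic",  "tags": []},
    {"name": "vod_purchase",   "tags": ["flaky"]},
    {"name": "epg_navigation", "tags": []},
    {"name": "standby_wakeup", "tags": ["flaky", "slow"]},
]

def nightly_selection(tests):
    return [t["name"] for t in tests if "flaky" not in t["tags"]]

print(nightly_selection(tests))  # ['zapping_basic', 'epg_navigation']
```

Flaky tests stay in the repository and can still be run on demand; they are just excluded from the green/red signal of the nightly build.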
And to give you an idea of what we’ve gained: one year ago, we had to test all the functionality of the VOD TV application. This needed five working days. Now, with our test suite, the tests are run every night, and it costs Jenkins three hours. So developers
are happy now. And most of all, the team has gained a lot of confidence in its code. So we can start to do refactoring and change our code. We have something that assures us that we didn’t break everything.
Okay. About the lessons we learned. In fact, there is some key points that lead to the
success. The first one: we succeeded because we have full control over all our data. We have
a lot of mocks and stubs in order to simulate all the ecosystem. And we have the control
on the data. This means that we can reset the data. We can predict what the data will be when we run the test, because when you test something on live TV content, you have to be sure what the current program is.
Another thing is, we’ve identified that
the test activity must be made very soon in the development cycle. Because if you don’t
do that, if you start with a legacy application, like it was in our case, you will have a lot
of problems in the code, it’s quite obvious. If you want to test HTML and you didn’t put IDs on the different tags in the code, it will be a bit more difficult.
It’s the case also in your architecture. But we have identified that some other things must also be thought of in a testable way. That’s the case for all our test plans. In fact, it’s quite
fun, because when we tried to — when we begin to write automatic tests, we had to refactor
our test plan, because we have big test plans that say, okay, you do that, you do that,
you do that, and you check a list of 20 or 50 items, and you can’t make a direct match
between the test plan and your automatic test suite.
Another lesson we’ve learned is “simple is better”. Simple is better when you write
automatic tests, because what we didn’t want is to debug tests. We prefer to have 20 simple tests rather than one big one. Because when it crashes, what we want
to know is what is the functionality that doesn’t work. And what is exactly in your
— in your UI — where is the problem? You don’t want to repeat the test and add debug instructions to know why your test crashed.
And the last thing we have learned: not everything can be automated. So there are a lot of things that are too difficult, where you don’t
want to create a monster in order to test something that a human can test in five seconds.
For instance, we have a lot of feedback about automated testing of the video quality and the like. There are a lot of tools that do that, but they don’t do it well, and a human is
much better to do that. And now the question is, how do you test your
TV application? We test like that, with Jenkins and a lot of Raspberry Pis. We started with ten Raspberry Pis, and now we have a park of 40. And this will certainly
increase from week to week or month to month. Okay. Finished. I think I’ve been a bit shorter
than I expected. But this gives us a lot of time for questions.
>>Sonal Shah: Thank you very much, Olivier. [ Applause ]
Do we have any questions on the moderator link?
So while Alan sorts out the questions, anybody have a question in the audience?
>>Alan Myrvold: One of the questions from the moderator. You mentioned the box will
send back a screenshot after the test is run. Are there other means of verification?
>>Olivier Etienne: Can you say it slowly?>>Alan Myrvold: After the test is run, it
will send back a screenshot if it fails.>>Olivier Etienne: Yeah.
>>Alan Myrvold: Is there another way –>>Olivier Etienne: This is part of the things
we have done more than just sending HTTP request. When we started to put the test on Jenkins
and to build every night, we have a big amount of things to analyze. And we realized that
we need some more information than just an assertion that says, okay, that fails. Because sometimes what fails is that the STB didn’t reboot fast enough, so it’s completely frozen.
So now when there is a crash, we keep the screenshot of the screen, and we also catch
all the logs of the STB. And we associate that to the test. In order to have an easier
investigation.>>Sonal Shah: Any other questions?
Thank you so much. That was a great talk.