Discussion:
[yocto] [meta-raspberrypi][PATCH] firmware.inc: Fetch a zip instead of cloning a git repo
Jon Szymaniak
2015-06-26 04:16:28 UTC
Permalink
GitHub provides the ability to download repository contents at
a specified changeset as a zip file. This is generally *much* quicker
than fetching the entire git repository.

This resolves some do_fetch() failures I've seen for bcm2835-bootfiles
in which the clone operation takes a very long time, and the connection
eventually hangs and errors out.

Signed-off-by: Jon Szymaniak <***@gmail.com>
---
recipes-bsp/common/firmware.inc | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/recipes-bsp/common/firmware.inc b/recipes-bsp/common/firmware.inc
index ad3176a..5830bb0 100644
--- a/recipes-bsp/common/firmware.inc
+++ b/recipes-bsp/common/firmware.inc
@@ -1,8 +1,10 @@
RPIFW_SRCREV ?= "e42a747e8d5c4a2fb3e837d0924c7cc39999936a"
RPIFW_DATE ?= "20150206"
-RPIFW_SRC_URI ?= "git://github.com/raspberrypi/firmware.git;protocol=git;branch=master"
-RPIFW_S ?= "${WORKDIR}/git"
+RPIFW_SRC_URI ?= "https://github.com/raspberrypi/firmware/archive/${RPIFW_SRCREV}.zip"
+RPIFW_S ?= "${WORKDIR}/firmware-${RPIFW_SRCREV}"

SRC_URI = "${RPIFW_SRC_URI}"
+SRC_URI[md5sum] = "a0cd8bc3a82fa708e26da62350fcf485"
+SRC_URI[sha256sum] = "eebf3bbe2fda533da4b44e713090428e6c14306445543243ae03bca774894840"
SRCREV = "${RPIFW_SRCREV}"
PV = "${RPIFW_DATE}"
--
2.1.4

--
Burton, Ross
2015-06-26 08:31:14 UTC
Permalink
Post by Jon Szymaniak
GitHub provides this ability to download repository contents at
a specified changeset as a zip file. This is generally *much* quicker
than fetching the entire git repository.
Github also can and will regenerate these tarballs whenever it feels like
it, so you'll need to periodically update the checksums. Obviously as
existing developers will tend to have the tarballs cached locally, it can
be a while before this failure is reported back.

A better solution might be to add support for "depth" to the git fetcher,
so you can grab just the commit you are interested in instead of the entire
repository.
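
Roughly, the sequence the fetcher would need is something like the
following (a sketch only, demonstrated against a local toy repository
with made-up names; for GitHub the server has to permit fetching by
bare SHA, which is governed by uploadpack.allowReachableSHA1InWant):

```shell
# Sketch: a depth-1 fetch of one specific commit, shown against a
# local toy repository. Repo paths and commit messages are stand-ins.
set -e
tmp=$(mktemp -d)

# Build a toy "upstream" with two commits and allow fetch-by-SHA.
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "old history"
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "tip"
git -C "$tmp/upstream" config uploadpack.allowReachableSHA1InWant true
rev=$(git -C "$tmp/upstream" rev-parse HEAD)

# Fetch only that commit, with no history behind it.
git init -q "$tmp/clone"
git -C "$tmp/clone" remote add origin "file://$tmp/upstream"
git -C "$tmp/clone" fetch -q --depth 1 origin "$rev"
git -C "$tmp/clone" checkout -q FETCH_HEAD

# Only a single commit was transferred.
git -C "$tmp/clone" rev-list --count HEAD   # prints 1
```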

Ross
Gary Thomas
2015-06-26 09:07:05 UTC
Permalink
Post by Jon Szymaniak
GitHub provides this ability to download repository contents at
a specified changeset as a zip file. This is generally *much* quicker
than fetching the entire git repository.
Hopefully the zip file is also a bit more manageable - the cached version
of the git tree for this repo is HUGE!

3599813625 Mar 5 11:02 /work/misc/Poky/sources/git2_github.com.raspberrypi.firmware.git.tar.gz
Post by Jon Szymaniak
Github also can and will regenerate these tarballs whenever it feels like it, so you'll need to periodically update the checksums. Obviously as existing developers will tend to
have the tarballs cached locally, it can be a while before this failure is reported back.
A better solution might be to add support for "depth" to the git fetcher, so you can grab just the commit you are interested in instead of the entire repository.
Is that something that can be in the recipe, or is this ability
something that needs to be added to the bitbake/OE-core infrastructure?
--
------------------------------------------------------------
Gary Thomas | Consulting for the
MLB Associates | Embedded world
------------------------------------------------------------
--
Burton, Ross
2015-06-26 09:09:51 UTC
Permalink
Post by Gary Thomas
Is that something that can be in the recipe, or is this ability
something that needs to be added to the bitbake/OE-core infrastructure?
The fetcher needs to be patched.

Ross
Jon Szymaniak
2015-06-26 14:16:47 UTC
Permalink
Post by Burton, Ross
Post by Jon Szymaniak
GitHub provides this ability to download repository contents at
a specified changeset as a zip file. This is generally *much* quicker
than fetching the entire git repository.
Github also can and will regenerate these tarballs whenever it feels like
it, so you'll need to periodically update the checksums. Obviously as
existing developers will tend to have the tarballs cached locally, it can
be a while before this failure is reported back.
A better solution might be to add support for "depth" to the git fetcher,
so you can grab just the commit you are interested in instead of the entire
repository.
Ross
Hi Ross,

Excellent point about the regeneration potentially yielding different
checksums. I suppose they could change the compression level they use at
any moment in time... I'll look into adding that depth support to the
fetcher, as that doesn't look too hard at all.

I'm open to other suggestions as well, as this was just a first stab at it.
I've been seeing that cloning this git repo containing binary firmware
blobs takes an absurd amount of time, if it even finishes at all
successfully.

Cheers,
Jon
Burton, Ross
2015-06-26 14:19:43 UTC
Permalink
Post by Jon Szymaniak
I'm open to other suggestions as well, as this was just a first stab at
it. I've been seeing that cloning this git repo containing binary firmware
blobs takes an absurd amount of time, if it even finishes at all
successfully.
I believe github offers hosting of "release" tarballs too, so upstream
could take advantage of that. Having verified checksums of firmware is
useful from a security point of view as you can't really inspect the
sources for it...

Ross
Jon Szymaniak
2015-06-26 14:42:55 UTC
Permalink
Post by Burton, Ross
Post by Jon Szymaniak
I'm open to other suggestions as well, as this was just a first stab at
it. I've been seeing that cloning this git repo containing binary firmware
blobs takes an absurd amount of time, if it even finishes at all
successfully.
I believe github offers hosting of "release" tarballs too, so upstream
could take advantage of that. Having verified checksums of firmware is
useful from a security point of view as you can't really inspect the
sources for it...
That's actually what I looked for first, and definitely would use that if
it were available.

Generally when you apply a tag or manually create a release on GitHub, an
entry under "Tags" or "Releases" is created. GitHub will automatically provide
a zip and/or tar.gz of the repository sources -- I suspect these would suffer
from the same risk of changing checksums that you expressed concern over.
Therefore, it would require the upstream maintainer to upload a specific
.tar.gz, preferably with .sha256sum and .md5sum files.
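
For example, the sidecar files could be produced like this (the
tarball name and payload here are stand-ins, not the real release):

```shell
# Sketch: checksum sidecar files an upstream maintainer could publish
# next to a release tarball. Name and content are illustrative only.
printf 'firmware release payload' > firmware-20150206.tar.gz
sha256sum firmware-20150206.tar.gz > firmware-20150206.tar.gz.sha256sum
md5sum    firmware-20150206.tar.gz > firmware-20150206.tar.gz.md5sum

# Consumers then verify the download against the published sums.
sha256sum -c firmware-20150206.tar.gz.sha256sum
# prints: firmware-20150206.tar.gz: OK
```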

Back to the git depth point... why is "--depth 1" not the default for all
cases? Could anyone elaborate on some use cases where we'd actually want
the entire history for builds?

- Jon
Burton, Ross
2015-06-26 14:46:19 UTC
Permalink
Post by Jon Szymaniak
Back to the git depth point... why is "--depth 1" not the default for all
cases? Could anyone elaborate on some use cases where we'd actually want
the entire history for builds?
I'm sure I've been told that it's not as simple as you'd expect when it
comes to varying SHAs and existing clones and so on. I may be wrong.
There's one way to find out ;)

Ross
Petter Mabäcker
2015-07-05 19:19:37 UTC
Permalink
Post by Jon Szymaniak
GitHub provides this ability to download repository contents at
a specified changeset as a zip file. This is generally *much* quicker
than fetching the entire git repository.
Github also can and will regenerate these tarballs whenever it
feels like it, so you'll need to periodically update the
checksums. Obviously as existing developers will tend to have the
tarballs cached locally, it can be a while before this failure is
reported back.
A better solution might be to add support for "depth" to the git
fetcher, so you can grab just the commit you are interested in
instead of the entire repository.
Ross
Hi Ross,
Excellent point about the regeneration potentially yielding different
checksums. I suppose they could change the compression level they use
at any moment in time... I'll look into adding that depth support to
the fetcher, as that doesn't look too hard at all.
I'm open to other suggestions as well, as this was just a first stab
at it. I've been seeing that cloning this git repo containing binary
firmware blobs takes an absurd amount of time, if it even finishes at
all successfully.
Cheers,
Jon
Hi Jon,

Any news about this? I have also used a changeset very similar to the one
you suggest (use a .zip from GitHub) on top of meta-raspberrypi when
building, to get rid of the annoying problem that fetching takes a very
long time or, even worse, times out.

My suggestion is to go for the .zip changeset at least until --depth=1
is supported in the git fetcher.

@Andrei any comments from your side regarding this discussion?

BR,
Petter
Clemens Lang
2015-07-06 05:19:54 UTC
Permalink
Hello,
Post by Burton, Ross
Github also can and will regenerate these tarballs whenever it feels
like it, so you'll need to periodically update the checksums.
Obviously as existing developers will tend to have the tarballs cached
locally, it can be a while before this failure is reported back.
While that does happen from time to time, it's pretty rare. I see maybe
one case of this every couple of months in MacPorts. Additionally, after
a while the checksums generally change back again in almost all cases.

So, yes, this brings its own set of problems, but is still a worthwhile
improvement over the current situation IMO.


Best regards,
Clemens
--
Clemens Lang • Development Specialist
BMW Car IT GmbH • Lise-Meitner-Str. 14 • 89081 Ulm • http://bmw-carit.com
-------------------------------------------------------------------------
BMW Car IT GmbH
Geschäftsführer: Michael Würtenberger und Reinhard Stolle
Sitz und Registergericht: München HRB 134810
-------------------------------------------------------------------------
--
Anders Darander
2015-07-06 08:40:05 UTC
Permalink
Post by Clemens Lang
Post by Burton, Ross
Github also can and will regenerate these tarballs whenever it feels
like it, so you'll need to periodically update the checksums.
Obviously as existing developers will tend to have the tarballs cached
locally, it can be a while before this failure is reported back.
While that does happen from time to time it's pretty rare. I see maybe
one case of this every couple of months in MacPorts.
Well, we've tried this before, and the changing checksums caused us
all a lot of problems.
Post by Clemens Lang
Additionally, after a while the checksums generally change back again
in almost all cases.
Well, then that's almost twice as bad... That means that once the
changed checksum has been detected and a patch has been submitted, it's
likely to change back again... :(
Post by Clemens Lang
So, yes, this brings its own set of problems, but is still a worthwhile
improvement over the current situation IMO.
Well, no. Tarballs with changing checksums are not acceptable. They're
going to break new builds, new autobuilders, etc., and just cause
everyone unacceptable pain.

It's going to give a lot of us a huge support nightmare again...

If the checksums can be guaranteed to be stable, then, yes, such a
change can be looked upon.

Shallow clones are a lot more likely to be useful in this case, though
implementing them might have a few issues of their own...

Cheers,
Anders
--
Anders Darander
ChargeStorm AB / eStorm AB
--
Nikolay Dimitrov
2015-07-06 09:48:50 UTC
Permalink
Hi guys,

One issue with the regularly changing tarball checksums is that people
start to get used to these changes (i.e. everything looks like a false
positive). Currently the tarball checksums and SCM revisions are
probably the most important tools for build traceability. If we get
used to thinking of these checksums as "unreliable", it will be much
easier to miss an important component change which would otherwise
ring a bell.

Kind regards,
Nikolay
--
Paul Eggleton
2015-07-06 10:58:45 UTC
Permalink
Post by Nikolay Dimitrov
One issue with the regularly changing tarball checksums is that people
start to get used to these changes (i.e. everything looks like a false
positive). Currently the tarball checksums and SCM revisions are
probably the most important tools for build traceability. If we get
used to thinking of these checksums as "unreliable", it will be much
easier to miss an important component change which would otherwise
ring a bell.
Fully agreed.

There are a couple of things I think we can do here:

1) Implement shallow cloning in bitbake's git fetcher as suggested. This
shouldn't be too tricky. I've filed a bug to track this:

https://bugzilla.yoctoproject.org/show_bug.cgi?id=7958

(Richard is the default assignee, but anyone could potentially work on this).

2) In the meantime we could consider uploading git mirror tarballs to a
source mirror that gets enabled through meta-raspberrypi (it would need to
be via PREMIRRORS to actually solve the issue). This has the advantage that
it wouldn't require any changes to the firmware recipe itself, but new
tarballs would of course need to be uploaded every time SRCREV is changed
in the recipe.
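
As a sketch, the layer could carry something along these lines (the
mirror URL is a placeholder, and the _prepend override style matches
BitBake of this era):

```
# Hypothetical premirror carrying git2_* mirror tarballs for the
# firmware repository; the URL below is a placeholder.
PREMIRRORS_prepend = "\
git://github.com/raspberrypi/firmware.git https://downloads.example.org/mirror/sources/ \n \
"
```

The mirror tarballs themselves can be produced on a build host by
setting BB_GENERATE_MIRROR_TARBALLS = "1" and uploading the resulting
git2_* archives from DL_DIR.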

Cheers,
Paul
--
Paul Eggleton
Intel Open Source Technology Centre
--
Andrei Gherzan
2015-07-09 20:13:32 UTC
Permalink
[Message hidden by the list archive; Andrei's reply is quoted in Nikolay's response below.]
Nikolay Dimitrov
2015-07-10 08:47:50 UTC
Permalink
Hi Andrei,
Post by Andrei Gherzan
Finally I hop on to this discussion too.
On Mon, Jul 6, 2015 at 12:58 PM, Paul Eggleton
Post by Nikolay Dimitrov
One issue with the regularly changing tarball checksums is that people
start to get used to these changes (i.e. everything looks like a false
positive). Currently the tarball checksums and SCM revisions are
probably the most important tools for build traceability. If we get
used to thinking of these checksums as "unreliable", it will be much
easier to miss an important component change which would otherwise
ring a bell.
Fully agreed.
1) Implement shallow cloning in bitbake's git fetcher as suggested. This
https://bugzilla.yoctoproject.org/show_bug.cgi?id=7958
(Richard is the default assignee, but anyone could potentially work on this).
This is the fix that would really address the issue, and it would be a
useful feature for many other BSPs / layers out there.
2) In the mean time we could consider upload git mirror tarballs to a source
mirror that gets enabled through meta-raspberrypi (would need to be via
PREMIRRORS to actually solve the issue). This has the advantage that it
wouldn't require any changes to the kernel recipe itself, but new tarballs
would of course need to be uploaded every time SRCREV is changed in the
recipe.
And until 1) is done, we can have a premirror. Paul, can you upload a
tarball? Can I help you with anything to get this up? After we have
it, can we force premirrors when using a specific layer? I was thinking
of forcing it by adding PREMIRRORS to layer.conf.
I don't think this is a good move. The current solution is already
working properly, although with slower-than-ideal download speed.

Prepackaged tarballs will require constant manpower for supporting,
and it's probably better to be invested into looking for a better
solution.
Post by Andrei Gherzan
Using github snapshots is not a good idea. Most of the issues you guys
pointed out above I experienced as well. In my opinion we should combine
Paul's solutions in order to address this problem.
One more thing. Given the fact that the repository we are talking about
is not under our control, we shouldn't rely on releases or other things
from the remote repository.
Andrei
Regards,
Nikolay
--
