Age | Commit message (Collapse) | Author | Files | Lines |
|
This was broken after the select for update changes. We really should
split the whole filesonly update into another method instead of the
current shotgun approach with conditionals everywhere.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This adds a bunch of transaction magic and SELECT FOR UPDATE stuff to
reporead to cope with the now-concurrent runs of reporead we get when
invoked from our inotify-based updater. The collision occurs with 'any'
architecture packages as both repo databases contain the new version,
and the updates occur at exactly the same time.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This prevents the reporead job from taking over time from more important
processes; this is not a rush task.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This prevents an otherwise idle connection from sitting around and being
totally useless.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This prevents memory usage from ballooning to absolutely huge values,
such as when multiple threads kick off at the same time. The bulk of our
memory allocation obviously comes in these threads and not the main
threads, so being able to isolate them in processes helps a lot.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This is the new on-the-fly updates hotness. Rather than continue to
schedule reporead to run once an hour in cron or however else you ran
it, this command can be run once and left running, and will
automagically pick up on any database file changes and run an import.
It operates on the files databases only; this will keep both the
packages and files always in sync and remove the delay in updating,
especially helpful for new testing packages.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Ensure we can accept either an Arch object or an architecture name when
passed to read_repo() by moving the validation there and being a bit
more careful about typechecking and object lookup.
Signed-off-by: Dan McGee <dan@archlinux.org>
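
A minimal sketch of the "object or name" handling described above. The Arch class and the KNOWN_ARCHES table here are hypothetical stand-ins for the Django model and its database lookup; resolve_arch() is an illustrative helper, not the actual read_repo() code.

```python
class Arch:
    """Hypothetical stand-in for the Django Arch model."""

    def __init__(self, name):
        self.name = name


# Hypothetical lookup table standing in for a database query by name.
KNOWN_ARCHES = {name: Arch(name) for name in ("any", "i686", "x86_64")}


def resolve_arch(arch):
    """Return an Arch given either an Arch instance or an architecture name."""
    if isinstance(arch, Arch):
        # Already the object we want; hand it back untouched.
        return arch
    if isinstance(arch, str):
        try:
            return KNOWN_ARCHES[arch]
        except KeyError:
            raise ValueError("unknown architecture: %r" % (arch,))
    raise TypeError("expected Arch or str, got %s" % type(arch).__name__)
```

Doing the validation in one place means every caller can pass whichever form it already has.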
|
|
Fuck you too, Django.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This makes them totally unusable for any real purpose down the road.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This grabs all the PGP keys from the developer profiles and adds them to
the keyrings. Obviously we may want to do more in the future such as
filter by groups, active status, etc. but this is just a first
iteration.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
And add eventual display code for it to the details template, but don't
show it yet as no packages will have it.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This allows quick resolution of all unmatched packages, especially after
tweaking the way find_user works.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This moves the cache inside an instance. Also add a few more tests.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This could be handy elsewhere as well, and it is only loosely coupled to
the rest of reporead.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
* Parse builddate when reading from repo database file
* Use defaultdict where it comes in handy
Signed-off-by: Dan McGee <dan@archlinux.org>
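
Both bullet points can be sketched together. This assumes the usual layout of a repo database entry, with %SECTION% headers followed by one value per line and a blank line between sections. parse_desc() and parse_builddate() are illustrative names, not the functions in reporead.

```python
from collections import defaultdict
from datetime import datetime, timezone


def parse_desc(text):
    """Collect the values listed under each %SECTION% header."""
    data = defaultdict(list)  # missing sections start as empty lists
    section = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            section = None  # a blank line ends the current section
        elif section is None and line.startswith("%") and line.endswith("%"):
            section = line[1:-1].lower()
        elif section is not None:
            data[section].append(line)
    return data


def parse_builddate(data):
    """Return the build date as an aware UTC datetime, or None if unusable."""
    values = data.get("builddate")
    if not values:
        return None
    try:
        return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)
    except ValueError:
        return None
```

The defaultdict removes the "create the list if the key is new" branch from the parsing loop.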
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
The real reason I originally added transactions to this code was to
prevent half-updates; e.g. a package gets in without the matching
depends values. We can safely commit between packages and resume
processing the database at a later time.
Take advantage of this fact and commit every so often in batch fashion
if we have a lot of updates piling up. In the case of updating the files
DB, this can really cut down on the need to hold open a long-running,
statement-heavy transaction and get the information public faster.
Signed-off-by: Dan McGee <dan@archlinux.org>
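
A rough sketch of the batching idea, using sqlite3 from the standard library rather than the Django transaction API the real code uses. The table names, the batch_size parameter, and import_packages() are assumptions made for illustration only.

```python
import sqlite3


def import_packages(conn, packages, batch_size=100):
    """Insert packages with their depends, committing every batch_size packages.

    Commits only ever happen between packages, so a package never becomes
    visible without its matching depends rows. Returns the number of commits.
    """
    pending = 0
    commits = 0
    for name, depends in packages:
        conn.execute("INSERT INTO package (name) VALUES (?)", (name,))
        conn.executemany(
            "INSERT INTO depend (pkgname, depname) VALUES (?, ?)",
            [(name, dep) for dep in depends],
        )
        pending += 1
        if pending >= batch_size:
            conn.commit()  # safe point: the package and its depends are complete
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # flush the final partial batch
        commits += 1
    return commits
```

If the run is interrupted, everything up to the last commit stays in place and the next run can resume from there.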
|
|
Now that we aren't seeing odd segfaults and hung tasks, we can remove
the traceback stuff from the scripts. Also use the 'io' module only; it
has been long enough.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Rather than the twisted mix of local times and UTC times we currently have.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Ensure all our multivalued attributes already exist on the object
beforehand, and add some special sauce to handle the difference between
a package without files and a database without files entries.
Signed-off-by: Dan McGee <dan@archlinux.org>
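
One way to picture the distinction, as a sketch only: list-valued attributes always start out as empty lists, while files starts as None so that "the database had no files entry" can be told apart from "the package has no files". RepoPackage, MULTIVALUED, and files_action() are hypothetical names and may not match the real code.

```python
# Hypothetical set of multivalued attribute names.
MULTIVALUED = ("depends", "groups", "license", "conflicts", "provides", "replaces")


class RepoPackage:
    """Illustrative container for one parsed database entry."""

    def __init__(self, name):
        self.name = name
        # Every multivalued attribute exists up front, each with its own list.
        for attr in MULTIVALUED:
            setattr(self, attr, [])
        # None means no files entry was seen; [] means an entry with no files.
        self.files = None


def files_action(pkg):
    """Decide what to do with the stored file list for a package."""
    if pkg.files is None:
        return "keep"  # nothing known; leave the stored list alone
    return "replace"  # entry present, even if empty
```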
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This will come in more handy with our new models, but we can adapt groups
and licenses to use it first.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This comes with pacman 3.5, replacing the old "force" PKGBUILD option.
We parse it and store it for now, but don't display it anywhere just
yet. Also update a few queries relying on version differences in any of
the multiple parts.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
When importing over a million files, it makes sense to take the slightly
faster route and call the PackageFile() constructor directly rather than
going through the related manager's create method.
We can also get huge performance improvements, especially with files
databases, by using the 'io' rather than 'codecs' module. The former is
implemented in C as of Python 2.7 and cuts a no-work import (so
measuring only the DB read speed) of extra.files.tar.gz from ~30 seconds
to ~5 seconds.
Signed-off-by: Dan McGee <dan@archlinux.org>
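
For illustration, here is roughly what reading a text member of a files database through the 'io' module can look like. It is written for Python 3, and read_member_lines() is a hypothetical helper rather than the reporead code; it assumes the named member is a regular file.

```python
import io
import tarfile


def read_member_lines(tar_bytes, member_name):
    """Decode one member of a gzipped tar archive into a list of text lines."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        raw = tar.extractfile(member_name)  # binary file-like object
        # io.TextIOWrapper does the decoding, replacing a codecs-based reader.
        with io.TextIOWrapper(raw, encoding="utf-8") as text:
            return [line.rstrip("\n") for line in text]
```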
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This allows us to store multiple licenses per package in a more elegant
fashion, and will later allow us to search and filter on this information.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Main change is just to move groups from the default packagegroup_set
location to a related_name of groups. Also refer to the Package class
directly rather than by text string if we have it available.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Don't use 'fmtstr % (arg1, arg2)' type format; logger can be passed a format
string and the arguments to populate it. Saves a bit of work for strings
that never end up getting displayed anyway.
Signed-off-by: Dan McGee <dan@archlinux.org>
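
The difference is easy to demonstrate with the standard logging module. The logger name, the Expensive class, and the two helper functions below are made up for this sketch.

```python
import logging

logger = logging.getLogger("reporead_sketch")
logger.setLevel(logging.INFO)  # DEBUG messages are discarded
logger.addHandler(logging.NullHandler())


class Expensive:
    """Counts how often it is converted to a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive value"


def log_lazy(value):
    # The logger formats the message only if the record is actually emitted.
    logger.debug("Finished: %s", value)


def log_eager(value):
    # The string is built before the logger can decide to drop it.
    logger.debug("Finished: %s" % value)
```

With the level set to INFO, log_lazy() never calls __str__, while log_eager() pays for the formatting every time.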
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
We didn't verify that the version in the files database was the same as in
the SQL side of things, so we could load old files for a new package and
lose track of this fact. When loading files, ensure the database version
matches the version in the package before continuing with the file load
operation.
There are also a few other small updates in here, like skipping the sanity
check for filesonly as we never delete packages, and removing some
unnecessary string concatenation operations.
Signed-off-by: Dan McGee <dan@archlinux.org>
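
The version check can be sketched with plain dictionaries standing in for the SQL side and the files database. load_files() and both argument shapes are assumptions for illustration; the real code works on model objects.

```python
def load_files(sql_versions, files_db):
    """Load file lists only where the files database matches the SQL version.

    sql_versions maps package name to the version stored on the SQL side.
    files_db maps package name to a (version, files) tuple.
    Returns (loaded, skipped): a name-to-files dict and a list of skipped names.
    """
    loaded = {}
    skipped = []
    for name, (version, files) in files_db.items():
        if sql_versions.get(name) != version:
            # Stale or unknown entry: do not attach old files to a new package.
            skipped.append(name)
            continue
        loaded[name] = files
    return loaded, skipped
```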
|
|
And make filename check more lenient.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
Clean up some of the orphan cleanup code, especially so we are never lying
in the percentage we print, and remove a bunch of debug prints that aren't
all that useful.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
We had this set up as a unique ForeignKey before, which adds some
indirection due to the RelatedManager object being there. By making it a
OneToOneField, we can get the profile object directly, enforce uniqueness,
and also use it in select_related() calls to make our profiles page a bit
more efficient.
Signed-off-by: Dan McGee <dan@archlinux.org>
|
|
This needed a little sprucing up as it has grown quite organically over the
life of this script. Make things a bit more pythonic through the use of
iterators rather than collection indexing, and try to generalize the special
cases of things a bit.
Also catch encoding problems early and fail gracefully rather than blow up
the entire package parser. A failed decode of a file should cause us to just
skip it rather than stop the entire parser. Worst case, this leaves that
package out of the web interface.
Signed-off-by: Dan McGee <dan@archlinux.org>
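
A small sketch of the "skip on decode failure" behavior, using an iterator as the message suggests. decode_entries() and the (name, bytes) input shape are assumptions, not the parser's real interface.

```python
import logging

logger = logging.getLogger("reporead_sketch")


def decode_entries(entries, encoding="utf-8"):
    """Yield (name, text) pairs, skipping any entry that fails to decode."""
    for name, raw in entries:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            # One bad file should not stop the whole parser.
            logger.warning("Could not decode %s, skipping", name)
            continue
        yield name, text
```

The cost of a bad entry is limited to that one package being left out.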
|
|
When I have caught reporead behaving badly on the production box, I haven't
been able to successfully get a traceback without killing the process.
Hopefully using a different signal will allow me to actually capture some
data.
Signed-off-by: Dan McGee <dan@archlinux.org>
|