
A big thing that trips people up, until they try to use a public project (from source) or an older project, is the concept of a dependencies file and a lock file.

The dependency file (what requirements.txt is supposed to be) just documents the things you depend on directly, and possibly known version constraints. A lock file captures the exact versions of your direct and indirect dependencies at the moment it's generated. When you go to use the project, the tool will read the lock file, if it exists, and match those versions for anything listed directly or indirectly in the dependency file. It's like keeping a snapshot of the exact last-working dependency configuration. You can always tell it to update the lock file and it will try to recalculate everything from the latest versions that meet the constraints in the dependency file, but if something doesn't work with that you'll presumably have your old lock file to fall back on _that will still work_.
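
A toy sketch of that difference, in case it helps. Everything here is made up for illustration: the package names, the version lists, and the helpers `resolve` and `install_plan`. Real resolvers also deal with upper bounds, markers, backtracking and hashes, none of which is modelled.

```python
# Stand-in for a package index (hypothetical packages and versions).
AVAILABLE = {
    "webclient": ["2.30.0", "2.31.0", "2.32.0"],
    "sockets": ["1.26.18", "2.0.7", "2.2.1"],
}

# Dependency file: direct dependencies only, with loose minimum versions.
DEPENDENCIES = {"webclient": "2.31.0"}

# What each package itself requires (the indirect dependencies).
REQUIRES = {"webclient": {"sockets": "1.26.18"}, "sockets": {}}


def as_tuple(version):
    """Turn '2.31.0' into (2, 31, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(dependencies):
    """Pick the newest available version meeting each minimum, recursively."""
    lock = {}
    pending = dict(dependencies)
    while pending:
        name, minimum = pending.popitem()
        candidates = [v for v in AVAILABLE[name] if as_tuple(v) >= as_tuple(minimum)]
        lock[name] = max(candidates, key=as_tuple)
        for dep, dep_minimum in REQUIRES[name].items():
            if dep not in lock:
                pending[dep] = dep_minimum
    return lock


def install_plan(dependencies, lock=None):
    """Use the lock file if there is one; otherwise resolve from latest."""
    return dict(lock) if lock is not None else resolve(dependencies)


lock = resolve(DEPENDENCIES)               # snapshot taken when things worked
AVAILABLE["sockets"].append("3.0.0")       # upstream ships a new release later
pinned = install_plan(DEPENDENCIES, lock)  # still the snapshot
fresh = install_plan(DEPENDENCIES)         # picks up sockets 3.0.0
```

With the lock, `pinned` stays on the snapshot even though the index moved; without it, `fresh` floats to whatever is newest that day.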

It's a standard issue/pattern in all dependency managers, but it's only been getting attention for a handful of years with the focus on reproducibility for supply chain verification/security. It has the side effect of helping old projects keep working much longer though. Python has had multiple competing options for solutions, and only in the last couple of years did they pick a format winner.



Then why even have the dependency file?

If the dependency file is wrong, and describes versions that are incompatible with the project, it should be fixed. Duplicating that information elsewhere is wrong.

Lockfiles have a very obvious use case: Replicable builds across machines in CI. You want to ensure that all the builds in the farm are testing the same thing across multiple runs, and that new behaviors aren't introduced because numpy got revved in the middle of the process. When that collective testing process is over, the lockfile is discarded.

You should not use lockfiles as a "backup" to pyproject.toml. The version constraints in pyproject.toml should be correct. If you need to restrict to a single specific version, do so, "== 2.2.9" works fine.


>Then why even have the dependency file?

Horses for courses.

Dependency files - whether the project's requirements (or optional requirements, or in the future, other arbitrary dependency groups) in `pyproject.toml`, or a list in a `requirements.txt` file (the filename here is actually arbitrary) - don't describe versions at all, in general. Their purpose is to describe what's needed to support the current code: its direct dependencies, with only as much restriction on versions as is required. The base assumption is that if a new version of a dependency comes out, it's still expected to work (unless a cap is set explicitly), and has a good chance of improving things in general (better UI, more performant, whatever). This is suitable for library development: when others will cite your code as a dependency, you avoid placing unnecessary restrictions on their environment.

Lockfiles are meant to describe the exact version of everything that should be in the environment to have exact reproducible behaviour (not just "working"), including transitive dependencies. The base assumption is that any change to anything in the environment introduces an unacceptable risk; this is the tested configuration. This is suitable for application development: your project is necessarily the end of the line, so you expect others to be maximally conservative in meeting your specific needs.
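
To illustrate "this is the tested configuration", here is a rough sketch of checking an environment against a set of locked versions. The helper names `installed_version` and `drift` are my own, and the comparison is a plain string match with no version normalization (so "1.0" and "1.0.0" would count as different); `importlib.metadata.version` is the standard-library lookup.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def drift(locked, lookup=installed_version):
    """Map each package whose installed version differs from the lock
    to a (locked, installed) pair. An empty dict means no drift."""
    mismatches = {}
    for name, wanted in locked.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Under the lockfile mindset, any non-empty result from `drift(...)` means you are no longer running what was tested, even if the code happens to work.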

You could also take this as an application of Postel's Law.

>Lockfiles have a very obvious use case: Replicable builds across machines in CI.

There are others who'd like to replicate their builds: application developers who don't want to risk getting bug reports for problems that turn out to be caused by upstream updates.

> You should not use lockfiles as a "backup" to pyproject.toml. The version constraints in pyproject.toml should be correct. If you need to restrict to a single specific version, do so, "== 2.2.9" works fine.

In principle, if you need a lockfile, you aren't distributing a library package anyway. But the Python ecosystem is still geared around the idea that "applications" would be distributed the same way as libraries - as wheels on PyPI, which get set up in an environment, using the entry points specified in `pyproject.toml` to create executable wrappers. Pipx implements this (and rejects installation when no entry points are defined); but the installation will still ignore any `requirements.txt` file (again, the filename is arbitrary; but also, Pipx is delegating to Pip's ordinary library installation process, not passing `-r`).

You can pin every version in `pyproject.toml`. Your transitive dependencies still won't be pinned that way. You can explicitly pin those, if you've done the resolution. You still won't have hashes or any other supply-chain info in `pyproject.toml`, because there's nowhere to put it. (Previous suggestions of including actual lockfile data in `pyproject.toml` have been strongly rejected - IIRC, Hatch developer Ofek Lev was especially opposed to this.)
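
A small sketch of why pinning your direct dependencies doesn't cover the rest. The dependency graph and the `transitive_closure` helper are hypothetical, purely for illustration.

```python
# Hypothetical dependency graph: each package maps to what it requires.
GRAPH = {
    "myapp": ["framework", "client"],
    "framework": ["templating", "routing"],
    "client": ["sockets"],
    "templating": ["markup"],
    "routing": [],
    "sockets": [],
    "markup": [],
}


def transitive_closure(root, graph):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    stack = list(graph[root])
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(graph[name])
    return seen


# What an exact pin in myapp's own metadata can cover: its direct dependencies.
direct = set(GRAPH["myapp"])

# What actually ends up in the environment.
everything = transitive_closure("myapp", GRAPH)

# Packages whose versions are decided by someone else's constraints.
unpinned = sorted(everything - direct)
print(unpinned)
```

Here four of the six installed packages are outside anything `myapp` pinned directly, unless you do the resolution yourself and list them explicitly.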

Perhaps in the post-PEP 751 future, this could change. PEP 751 specifies both a standard lockfile format (with all the sorts of metadata that various tools might want) and a standard filename (or at least filename pattern). A future version of Pipx could treat `pylock.toml` as the "compiled" version of the "source" dependencies in `pyproject.toml`, much like Pip (and other installers) treat `PKG-INFO` (in an sdist, or `METADATA` in a wheel) as the "compiled" version (dependency resolution notwithstanding!) of other metadata.


> Lockfiles are meant to describe the exact version of everything that should be in the environment to have exact reproducible behaviour (not just "working"), including transitive dependencies.

It is wrong to specify the versions for your transitive dependencies except to achieve reproducible builds, as in CI or other situations. If a dependency fails to correctly describe its requirements in its pyproject.toml, that is a bug and should be fixed in the relevant upstream.

> There are others who'd like to replicate their builds: application developers who don't want to risk getting bug reports for problems that turn out to be caused by upstream updates.

If your application only works with specific versions of dependencies and has bugs in others, that should be described in pyproject.toml.

> In principle, if you need a lockfile, you aren't distributing a library package anyway.

It's irrelevant what kind of project you're describing: library, application, build-tooling, etc. Your pyproject.toml should contain the correct information for that project to run. If your project cannot run, contains bugs, or otherwise needs more specific information than is present in the pyproject.toml, the answer is to fix the pyproject.toml.

Reproducible builds for the purpose of testing or producing hash-identical wheels, or similar situations where the goal is producing an exact snapshot of a build, is the only reason to be using lockfiles. None of those use cases aligns with, for example, tracking the lockfile in source control.


>It is wrong to specify the versions for your transitive dependencies except to achieve reproducible builds

Yes, and this is why many people have both pyproject.toml and requirements.txt. pyproject.toml is meant to specify abstract, unresolved dependencies only.

>If your application only works with specific versions of dependencies and has bugs in others, that should be described in pyproject.toml.

That's quite literally not the design, if by "dependencies" you mean including transitive dependencies. pyproject.toml isn't there to enable reproducible builds. This is exactly why I included that speculation about the post-PEP751 future: currently we can "install applications", but with tools that aren't meant to handle exact application configurations.

> Reproducible builds for the purpose of testing or producing hash-identical wheels, or similar situations where the goal is producing an exact snapshot of a build, is the only reason to be using lockfiles.

Some application developers would say that, as far as they are concerned, the code "cannot run" except in the context of a reproducible build. If it works otherwise, that's an "upside", but they don't want to support it.

I think we're just going around in circles here.


> That's quite literally not the design, if by "dependencies" you mean including transitive dependencies.

I don't. Your transitive dependencies are not your problem, they are upstream's problem. Anything regarding version requirements of such packages belongs upstream.

> pyproject.toml isn't there to enable reproducible builds.

Agreed. The repository itself should not contain anything related to reproducible builds. Reproducible builds are a packaging concern, not part of the source of the project itself. For example, it would be appropriate to ship a lockfile alongside a wheel or a tarball, stating that it is the lockfile used to produce that particular build of the project; but both the wheel and the lockfile exist outside the context of the project itself.

> Some application developers would say that, as far as they are concerned, the code "cannot run" except in the context of a reproducible build.

Yes, they are wrong.



