When we originally introduced the trust-on-first-use checksum locking
mechanism in v0.14, we had to make some tricky decisions about how it
should interact with the pre-existing optional read-through global cache
of provider packages:
The global cache essentially conflicts with the checksum locking because
if the needed provider is already in the cache then Terraform skips
installing the provider from upstream and therefore misses the opportunity
to capture the signed checksums published by the provider developer. We
can't use the signed checksums to verify a cache entry because the origin
registry protocol is still using the legacy ziphash scheme and that is
only usable for the original zipped provider packages and not for the
unpacked-layout cache directory. Therefore we decided to prioritize the
existing cache directory behavior at the expense of the lock file behavior,
making Terraform produce an incomplete lock file in that case.
Now that we've had some real-world experience with the lock file mechanism,
we can see that the chosen compromise was not ideal because it causes
"terraform init" to behave significantly differently in its lock file
update behavior depending on whether or not a particular provider is
already cached. By robbing Terraform of its opportunity to fetch the
official checksums, Terraform must generate a lock file that is inherently
non-portable, which is problematic for any team which works with the same
Terraform configuration on multiple different platforms.
This change addresses that problem by essentially flipping the decision so
that we'll prioritize the lock file behavior over the provider cache
behavior. Now a global cache entry is eligible for use if and only if the
lock file already contains a checksum that matches the cache entry. This
means that the first time a particular configuration sees a new provider
it will always be fetched from the configured installation source
(typically the origin registry) and record the checksums from that source.
On subsequent installs of the same provider version already locked,
Terraform will then consider the cache entry to be eligible and skip
re-downloading the same package.
This intentionally makes the global cache mechanism subordinate to the
lock file mechanism: the lock file must be populated in order for the
global cache to be effective. For those who have many separate
configurations which all refer to the same provider version, they will
need to re-download the provider once for each configuration in order to
gather the information needed to populate the lock file, whereas before
they would have only downloaded it for the _first_ configuration using
that provider.
This should therefore remove the most significant cause of folks ending
up with incomplete lock files that don't work for colleagues using other
platforms, and the expense of bypassing the cache for the first use of
each new package with each new configuration. This tradeoff seems
reasonable because otherwise such users would inevitably need to run
"terraform providers lock" separately anyway, and that command _always_
bypasses the cache. Although this change does decrease the hit rate of the
cache, if we subtract the never-cached downloads caused by
"terraform providers lock" then this is a net benefit overall, and does
the right thing by default without the need to run a separate command.
We previously had some tests for some happy paths and a few specific
failures into an empty directory with no existing locks, but we didn't
have tests for the installer respecting existing lock file entries.
This is a start on a more exhaustive set of tests for the installer,
aiming to visit as many of the possible codepaths as we can reasonably
test using this mocking strategy. (Some other codepaths require different
underlying source implementations, etc, so we'll have to visit those in
other tests separately.)
Instead of searching the installed provider package directory for a
binary as we install it, we can lazily detect the executable as it is
required. Doing so allows us to separately report an invalid unpacked
package, giving the user more actionable error messages.
Providers installed from the registry are accompanied by a list of
checksums (the "SHA256SUMS" file), which is cryptographically signed to
allow package authentication. The process of verifying this has multiple
steps:
- First we must verify that the SHA256 hash of the package archive
matches the expected hash. This could be done for local installations
too, in the future.
- Next we ensure that the expected hash returned as part of the registry
API response matches an entry in the checksum list.
- Finally we verify the cryptographic signature of the checksum list,
using the public keys provided by the registry.
Each of these steps is implemented as a separate PackageAuthentication
type. The local archive installation mechanism uses only the archive
checksum authenticator, and the HTTP installation uses all three in the
order given.
The package authentication system now also returns a result value, which
is used by command/init to display the result of the authentication
process.
There are three tiers of signature, each of which is presented
differently to the user:
- Signatures from the embedded HashiCorp public key indicate that the
provider is officially supported by HashiCorp;
- If the signing key is not from HashiCorp, it may have an associated
trust signature, which indicates that the provider is from one of
HashiCorp's trusted partners;
- Otherwise, if the signature is valid, this is a community provider.
Historically our logic to handle discovering and installing providers has
been spread across several different packages. This package is intended
to become the home of all logic related to what is now called "provider
cache directories", which means directories on local disk where Terraform
caches providers in a form that is ready to run.
That includes both logic related to interrogating items already in a cache
(included in this commit) and logic related to inserting new items into
the cache from upstream provider sources (to follow in later commits).
These new codepaths are focused on providers and do not include other
plugin types (provisioners and credentials helpers), because providers are
the only plugin type that is represented by a heirarchical, decentralized
namespace and the only plugin type that has an auto-installation protocol
defined. The existing codepaths will remain to support the handling of
the other plugin types that require manual installation and that use only
a flat, locally-defined namespace.