During the 0.12 work we intended to move all of the variable value
collection logic into the UI layer (command package and backend packages)
and present them all together as a unified data structure to Terraform
Core. However, we didn't quite succeed because the interactive prompts
for unset required variables were still being handled _after_ calling
into Terraform Core.
Here we complete that earlier work by moving the interactive prompts for
variables out into the UI layer too, thus allowing us to handle final
validation of the variables all together in one place and do so in the UI
layer where we have the most context still available about where all of
these values are coming from.
This allows us to fix a problem where previously disabling input with
-input=false on the command line could cause Terraform Core to receive an
incomplete set of variable values, and fail with a bad error message.
As a consequence of this refactoring, the scope of terraform.Context.Input
is now reduced to only gathering provider configuration arguments. Ideally
that too would move into the UI layer somehow in a future commit, but
that's a problem for another day.
We previously deferred this to Validate, but all of our operations require
a valid set of variables and so checking this up front makes it more
obvious when a call into Terraform Core from the CLI layer isn't
populating variables correctly, instead of having it fail downstream
somewhere.
It's the caller's responsibility to ensure that it has collected values
for all of the variables in some way before calling NewContext, because
in the main case driven by the CLI there are many different places that
variable values can be collected from and so handling the main user-facing
validation in the CLI allows us to return better error messages that take
into account which way a variable is (or is not) being set.
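As a toy illustration of why this belongs in the CLI layer (all names here are invented for the example), the UI can report which source a variable came from — something Terraform Core has no way to know:

```go
package main

import "fmt"

func main() {
	declared := []string{"region", "instance_count"}

	// The CLI layer knows which source each value came from:
	// -var flags, -var-file, environment, interactive prompt, etc.
	collected := map[string]string{
		"region": "-var on the command line",
	}

	// Validate up front, before calling terraform.NewContext, so a
	// missing value produces a source-aware error instead of failing
	// somewhere downstream in Core.
	for _, name := range declared {
		if _, ok := collected[name]; !ok {
			fmt.Printf("Error: no value for required variable %q\n", name)
		}
	}
}
```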
Previously we were using the experimental HCL 2 repository, but now we'll
shift over to the v2 import path within the main HCL repository as part of
actually releasing HCL 2.0 as stable.
This is a mechanical search/replace to the new import paths. It also
switches to the v2.0.0 release of HCL, which includes some new code that
Terraform didn't previously have but should not change any behavior that
matters for Terraform's purposes.
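For illustration, a file that previously imported the experimental repository now uses the v2 paths within the main repository; the parse call below is just a minimal usage example:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"           // was github.com/hashicorp/hcl2/hcl
	"github.com/hashicorp/hcl/v2/hclsyntax" // was github.com/hashicorp/hcl2/hcl/hclsyntax
)

func main() {
	src := []byte("a = 1\n")
	f, diags := hclsyntax.ParseConfig(src, "example.tf", hcl.Pos{Line: 1, Column: 1})
	if diags.HasErrors() {
		fmt.Println(diags.Error())
		return
	}
	fmt.Printf("parsed %d attribute(s)\n", len(f.Body.(*hclsyntax.Body).Attributes))
}
```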
For the moment the experimental HCL2 repository is still an indirect
dependency via terraform-config-inspect, so it remains in our go.sum and
vendor directories. Because terraform-config-inspect uses
a much smaller subset of the HCL2 functionality, this does still manage
to prune the vendor directory a little. A subsequent release of
terraform-config-inspect should allow us to completely remove that old
repository in a future commit.
The old logic for `depends_on` was to short-circuit evaluation of the
data source, but that prevented a plan and state from being recorded.
Use the (currently unused) ForcePlanRead to ensure that the plan is
recorded when the config contains `depends_on`.
This does not fix the fact that `depends_on` does not work with data
sources, and will still produce a perpetual diff. This is only to fix
evaluation errors when an indexed data source is evaluated during
refresh.
* command/import: properly use `-provider` supplied on the command line
The import command now attaches the provider configuration in the resource
instance, if set. That config is attached to the NodeAbstractResource
during the import graph building. This prevents errors when the implied
provider is not actually in the configuration at all, which may happen
when a configuration is using the `-beta` version of a provider (and
only that `-beta` version).
* command/import: fix variable reassignment and update docs
Fixes #22564
Now that the most common cause of unknowns (invalid resource indexes) is
caught earlier, we can validate that the final apply config is wholly
known before attempting to apply it. This ensures that we're applying
the configuration we intend, and not silently dropping values.
Always return the entire resource object from
evaluationStateData.GetResource, rather than parsing the references for
individual instances. This allows for later evaluation of resource
indexes so we can return errors when they don't exist, and prevent
errors when short-circuiting around invalid indexes in conditionals.
In order to allow lazy evaluation of resource indexes, we can't index
resources immediately via GetResourceInstance. Change the evaluation to
always return whole Resources via GetResource, and index individual
instances during expression evaluation.
This will allow us to always check for invalid index errors rather than
returning an unknown value and ignoring it during apply.
The documentation for the -target option warns that it's intended for
exceptional circumstances only and not for routine use, but that's not a
very prominent location for that warning and so some users miss it.
Here we make the warning more prominent by including it directly in the
Terraform output when -target is in use. We first warn during planning
that the plan might be incomplete, and then warn again after apply
concludes and direct the user to run "terraform plan" to make sure that
there are no further changes outstanding. The latter message is intended
to reinforce that -target should only be a one-off operation and that you
should always run without it soon after to ensure that the workspace is
left in a consistent, converged state.
This interface is meant to replace the following ones (in use by some providers):
- httpclient.UserAgentString() (e.g. AzureRM, Google)
- terraform.UserAgentString (e.g. OpenStack, ProfitBricks)
- terraform.VersionString (e.g. AWS, AzureStack, DigitalOcean, Kubernetes)
This also proposes the initial UA string to be set to
HashiCorp Terraform/X.Y.Z (+https://www.terraform.io)
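A rough sketch of what such a unified helper might look like; the version constant here stands in for Terraform's real version package:

```go
package httpclient

import "fmt"

// terraformVersion stands in for the value provided by Terraform's
// real version package.
const terraformVersion = "0.12.0"

// UserAgentString returns the proposed unified User-Agent value.
func UserAgentString() string {
	return fmt.Sprintf("HashiCorp Terraform/%s (+https://www.terraform.io)",
		terraformVersion)
}
```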
This also fixes a few things with resource for_each:
- It makes validation more like validation for count.
- It makes sure the index is stored in the state properly.
Remove reflect.DeepEqual from path comparisons to get reliable results.
The equality issues were only noticed going through the gRPC interface, so add a
corresponding test to the test provider.
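For example, cty's structural Path.Equals comparison stays reliable where reflect.DeepEqual can disagree on index keys that are numerically equal but not bit-identical after a round trip (a sketch using go-cty directly):

```go
package main

import (
	"fmt"

	"github.com/zclconf/go-cty/cty"
)

func main() {
	a := cty.Path{}.GetAttr("network_interface").Index(cty.NumberIntVal(0))
	b := cty.Path{}.GetAttr("network_interface").Index(cty.NumberIntVal(0))

	// Compare the path steps structurally instead of using
	// reflect.DeepEqual over the whole value.
	fmt.Println(a.Equals(b)) // true
}
```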
Fix for a crash during terraform plan: If there is a multi-instance
resource (count > 1) where one of the instances was deleted in the
deployment but was still present in the terraform state,
getResourceInstancesAll crashed.
Check not only for rs.Instances[key] to exist, but also to have a
valid Current pointer.
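A self-contained sketch of the guard; the stub types stand in for Terraform's real state structs:

```go
package main

import "fmt"

type instanceObject struct{ attrsJSON []byte }

type instance struct{ Current *instanceObject }

type resource struct{ Instances map[int]*instance }

func main() {
	rs := &resource{Instances: map[int]*instance{
		0: {Current: &instanceObject{attrsJSON: []byte(`{}`)}},
		1: {Current: nil}, // deleted in the deployment, still in state
	}}

	for key, is := range rs.Instances {
		// Guard both conditions: the entry may be missing entirely,
		// or present with no current object.
		if is == nil || is.Current == nil {
			continue
		}
		fmt.Printf("instance %d has a current object\n", key)
	}
}
```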
This also includes a previously-missing test that verifies the behavior
described here, implemented as a planning context test for consistency
with how the other ignore_changes tests are handled.
Make sure private data is maintained all the way to destroy. This
slipped through, since private data isn't used much for current
providers, except for timeouts.
Send Private data blob through ReadResource as well. This will allow for
extra flexibility for future providers that may want to pass data out of
band through to their resource Read functions.
The config is statically validated early on for structural issues, but
the provider can't validate any inputs that were unknown at the time.
Run ValidateResourceTypeConfig during Plan, so that the provider can
validate the final config values, including those interpolated from
other resources.
* core: don't panic in NodeAbstractResourceInstance References()
It is possible for s.Current to be nil. This was hard to reproduce, so
the root cause is still unknown, but we can guard against the symptom.
* add log statement
This is a "should never happen" case, because we shouldn't ever have
resources in the plan that aren't in the configuration, but since we've
got a report of a crash here (which went away before we got a chance to
debug it) here's just an extra guard to ensure that we'll still exit
gracefully in that case.
If we see this error crop up again in future, it'd be nice to gather a
full trace log so we can see what GraphNodeAttachResourceConfig did and
why it did not attach a configuration.
Previously we tried to short-circuit this if the schema version hadn't
changed and we were already using JSON serialization. However, if we
instead call into UpgradeResourceState every time we can let the provider
or the SDK do some general, systematic normalization and cleanup steps
without always requiring a schema version increase.
What exactly would be fixed up this way is for the SDK to decide, but for
example the SDK might choose to automatically delete from the state
anything that is no longer present in the schema, avoiding the need to
write explicit state migration functions for that common case where the
remedy is always the same.
The actual update logic is delegated to the provider/SDK intentionally so
that it can evolve over time and potentially differ depending on how
each SDK thinks about schema.
With the new ConfigModeAttr, we can now have complex structures come in
as attributes rather than blocks. Previously attributes were either
known, or unknown, and there was no reason to descend into them. We now
need to record the complete path to unknown values within complex
attributes to create a proper diff after shimming the config.
If a datasource's dependencies have planned changes, then we need to
plan a read for the data source, because the config may change once the
dependencies have been applied.
The count for a data resource can potentially depend on a managed resource
that isn't recorded in the state yet, in which case references to it will
always return unknown.
Ideally we'd do the data refreshes during the plan phase as discussed in
#17034, which would avoid this problem by planning the managed resources
in the same walk, but for now we'll just skip refreshing any data
resources with an unknown count during refresh and defer that work to the
apply phase, just as we'd do if there were unknown values in the main
configuration for the data resource.
Earlier on in the v0.12 development cycle we made the decision that the
validation walk should consider input values to always be unknown so that
validation is checking validity for all possible inputs rather than for
a specific set of inputs; checking for a specific set of inputs is the
responsibility of the plan walk.
However, we didn't implement that in the best way: we made the
"terraform validate" command force all of the input variables to unknown
but that was insufficient because it didn't also affect the implicit
validation walk we do as part of "terraform plan" and "terraform apply",
causing those to produce confusingly-different results.
Instead, we'll address the problem directly in the reference resolver code,
ensuring that all variable values will always be treated as an unknown
(of the declared type, so type checking is still possible) during any
validate walk, regardless of which command is running it.
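A toy sketch of that rule; the walk constants and function are illustrative rather than Terraform's exact internals:

```go
package main

import (
	"fmt"

	"github.com/zclconf/go-cty/cty"
)

type walkOperation int

const (
	walkValidate walkOperation = iota
	walkPlan
)

// variableValue applies the resolver rule: during any validate walk, a
// variable is an unknown of its declared type, no matter which command
// started the walk.
func variableValue(op walkOperation, declared cty.Type, known cty.Value) cty.Value {
	if op == walkValidate {
		return cty.UnknownVal(declared)
	}
	return known
}

func main() {
	v := variableValue(walkValidate, cty.String, cty.StringVal("prod"))
	fmt.Println(v.IsKnown()) // false: type checking still works, value doesn't leak
}
```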
We previously attempted to make the special diff apply behavior for nested
sets of objects work with attribute mode by totally discarding attribute
mode for all shims.
In practice, that is too broad a solution: there are lots of other shimming
behaviors that we _don't_ want when attribute mode is enabled. In
particular, we need to make sure that the difference between null and
empty can be seen in configuration.
As a compromise then, we will give all of the shims access to the real
ConfigMode and then do a more specialized fixup within the diff-apply
logic: we'll construct a synthetic nested block schema and then use that
to run our existing logic to deal with nested sets of objects, while
using the previous behavior in all other cases.
In effect, this means that the special new behavior only applies when the
provider uses the opt-in ConfigMode setting on a particular attribute,
and thus this change has much less risk of causing broad, unintended
regressions elsewhere.
When an operation fails, providers may return a null new value rather than
returning a partial state. In that case, we'd prefer to keep the old value
so that we stand the best chance of being able to retry on a subsequent
run.
Previously we were making an exception for the delete action, allowing
the result of that to be null even when an error is returned. In practice
that was a bad idea because it would cause Terraform to lose track of the
object even though it might not actually have been deleted.
Now we'll retain the old object even in the delete case. Providers can
still return partial new objects if they were able to partially complete
a delete operation, in which case we'll discard what we had before, but
if the result is null with errors then we'll assume the delete failed
entirely and so just keep the old state as-is, giving us the opportunity
to refresh it on the next run to see if anything actually happened after
all.
(This also includes a new resource in the test provider which isn't used
by the patch but was useful for some manual UX testing here, so I thought
I'd include it in case it's similarly useful in future, given how simple
its implementation is.)
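A minimal sketch of the decision logic using go-cty values directly; chooseNewState is an invented helper name:

```go
package main

import (
	"errors"
	"fmt"

	"github.com/zclconf/go-cty/cty"
)

// chooseNewState keeps the prior object whenever the provider reports
// errors alongside a null new value, for any action including delete.
func chooseNewState(prior, returned cty.Value, errs []error) cty.Value {
	if returned.IsNull() && len(errs) > 0 {
		// Assume the operation failed entirely; keep the old state so
		// the next refresh can find out what really happened.
		return prior
	}
	return returned
}

func main() {
	prior := cty.ObjectVal(map[string]cty.Value{"id": cty.StringVal("i-abc123")})
	errs := []error{errors.New("delete failed")}
	result := chooseNewState(prior, cty.NullVal(prior.Type()), errs)
	fmt.Println(result.RawEquals(prior)) // true: the object is not forgotten
}
```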
In studying existing providers we've found a pattern we weren't previously
accounting for: using a nested block type to represent a group of
arguments that relate to a particular feature that is always enabled but
where it improves configuration readability to group all of its settings
together in a nested block.
The existing NestingSingle was not a good fit for this because it is
designed under the assumption that the presence or absence of the block
has some significance in enabling or disabling the relevant feature, and
so for these always-active cases we'd generate a misleading plan where
the settings for the feature appear totally absent, rather than showing
the default values that will be selected.
NestingGroup is, therefore, a slight variation of NestingSingle where
presence vs. absence of the block is not distinguishable (it's never null)
and instead its contents are treated as unset when the block is absent.
This then in turn causes any default values associated with the nested
arguments to be honored and displayed in the plan whenever the block is
not explicitly configured.
The current SDK cannot activate this mode, but that's okay because its
"legacy type system" opt-out flag allows it to force a block to be
processed in this way anyway. We're adding this now so that we can
introduce the feature in a future SDK without causing a breaking change
to the protocol, since the set of possible block nesting modes is not
extensible.
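A sketch of how a future SDK might declare such a block using Terraform's internal configschema package (the "settings" block and its attribute are invented for the example):

```go
package main

import (
	"fmt"

	"github.com/hashicorp/terraform/configs/configschema"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	schema := &configschema.Block{
		BlockTypes: map[string]*configschema.NestedBlock{
			// An always-active feature group: the block value is never
			// null, so omitting it just leaves its contents unset and
			// lets nested defaults show up in the plan.
			"settings": {
				Nesting: configschema.NestingGroup,
				Block: configschema.Block{
					Attributes: map[string]*configschema.Attribute{
						"enabled_mode": {Type: cty.String, Optional: true},
					},
				},
			},
		},
	}
	fmt.Println(schema.BlockTypes["settings"].Nesting == configschema.NestingGroup) // true
}
```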
For compatibility with documented patterns from existing providers we are
now allowing (via a pre-processing step) any attribute whose type is a
list-of-object or set-of-object type to optionally be assigned using one
or more blocks whose type is the attribute name.
The pre-processing functionality was implemented in previous commits but
we were not correctly detecting references within these blocks that are,
from the perspective of the primary schema, invalid. Now we'll use an
alternative implementation of variable detection that is able to apply the
same schema rewriting technique we used to implement the transform and
thus can find all of the references as if they were already in their
final locations.
When a resource type schema includes dynamically-typed attributes we can't
do any automatic conversion from flatmap to JSON because we don't know
how to interpret the keys that start with the dynamically-typed
attribute's prefix.
To work around that, we'll instead just ask the SDK to do a no-op upgrade
(current and target versions are the same) which will convert from flatmap
to JSON using the SDK's own logic as a side-effect.
This situation should rarely arise in real-world use, but it ends up being
very important for the helper/resource provider test harness because it
is forced to lower the state back to flatmap repeatedly after every step
in order to run legacy checking and processing code.
The hcldec package has no awareness of the dynamic block extension, so the
hcldec.Variables function misses any variables declared inside dynamic
blocks.
dynblock.VariablesHCLDec is a drop-in replacement for hcldec.Variables
that _is_ aware of dynamic blocks, returning all of the same variables
that hcldec would find naturally plus also any variables used inside
the dynamic block "for_each" and "labels" arguments and inside the
nested "content" block.
Our post-refresh safety check had the constraint and real type inverted,
so previously any refresh of a resource type with a dynamically-typed
attribute would fail this type check.
Also includes a small tweak to the error message from this check since the
old one was a little awkward to read in practice when the error is a
cty.PathError rendered with an attribute path prefix.
If a block is unaffected by diffs, keep the block count value regardless
of what it is. Blocks containing zero values will often be represented
by only the count value.
We are now allowing the legacy SDK to opt out of the safety checks we try
to do after plan and apply, and so in such cases the before/after values
in planned changes may be inconsistent with our usual rules.
To avoid adding lots of extra complexity to the diff renderer to deal with
these situations, instead we'll normalize the handling of nested blocks
prior to using these values.
In the long run it'd be better to do this normalization at the source,
immediately after we receive an object from a provider using the opt-out,
but we're doing this at the outermost layer for now to avoid risking
unintended impacts on other Terraform Core components when we're just
about to enter the beta phase of the v0.12.0 release cycle.
This uses the fixed "superset" schema from the main terraform package to
apply our standard expression mapping, with the exception of "type" where
interpolation sequences are not supported due to the type being evaluated
early to retrieve the schema for decoding the rest.
The init error was output deep in the backend by detecting a special
ResourceProviderError and formatting it directly to the CLI.
Create some Diagnostics closer to where the problem is detected, and
pass them back through the normal diagnostic flow. While the output
isn't as nice yet, this restores the helpful error message and makes the
code easier to maintain. Better formatting can be handled later.
A provider may react to a create or update failing by returning error
diagnostics and a partially-updated or nil new value, in which case we
do not expect our AssertObjectCompatible consistency check to succeed: the
provider is just assumed to be doing the best it can to preserve whatever
partial outcome it was able to achieve.
However, if errors are accompanied with a nil new value after an update,
we'll assume that the provider is telling us it wasn't able to get far
enough to make any change at all, and so we'll retain the prior value in
state. This ensures that a provider can't cause an object to be forgotten
from the state just because an update failed.
RequiresReplace paths with IndexSteps that have been added or removed
may fail to apply against one of the two state values. Only error out if
the path cannot be applied to both values.
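A small demonstration of that rule with go-cty directly: the path fails against the prior value but applies cleanly to the planned one, so it should not be treated as an error:

```go
package main

import (
	"fmt"

	"github.com/zclconf/go-cty/cty"
)

func main() {
	prior := cty.ObjectVal(map[string]cty.Value{
		"items": cty.ListVal([]cty.Value{cty.StringVal("a")}),
	})
	planned := cty.ObjectVal(map[string]cty.Value{
		"items": cty.ListVal([]cty.Value{cty.StringVal("a"), cty.StringVal("b")}),
	})

	// An IndexStep that only exists in the planned value.
	path := cty.Path{}.GetAttr("items").Index(cty.NumberIntVal(1))

	_, priorErr := path.Apply(prior)
	_, plannedErr := path.Apply(planned)

	// Error out only when the path applies to neither value.
	fmt.Println(priorErr != nil && plannedErr != nil) // false
}
```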
Prior to Terraform 0.12 there were certain behaviors we expected from
providers that were actually just details of the SDK and not part of the
enforced contract.
For 0.12 we're now codifying some of these behaviors explicitly via safety
checks in core, thus ensuring that all future providers will behave in a
consistent way that users can rely on.
Unfortunately, due to the hand-written nature of the mock provider
implementations we use in tests, they have been getting away with some
unusual behaviors that don't match our usual expectations, and our safety
checks now detect those as incorrect behaviors.
To address this, we make the minimal changes to each test to ensure that
its mock provider behaves in a consistent way, which requires that values
set in config be represented correctly in the plan and ultimately saved
in the new state, without any changes along the way. In particular, the
common testDiffFn implementation has historically used a number of special
hidden attributes to trigger special behaviors, and our new rules require
that these special settings propagate properly through the plan and into
the state.
Due to the imprecision of our shimming from the legacy SDK type system to
the new Terraform Core type system, the legacy SDK produces a number of
inconsistencies that produce only minor quirky behavior or broken
edge-cases. To retain compatibility with those existing weird behaviors,
the legacy SDK opts out of our safety checks.
The intent here is to allow existing providers to continue to do their
previous unsafe behaviors for now, accepting that this will allow certain
quirky bugs from previous releases to persist, and then gradually migrate
away from the legacy SDK and remove this opt-out on a per-resource basis
over time.
As with the apply-time safety check opt-out, this is reserved only for
the legacy SDK and must not be used in any new SDK implementations. We
still include any inconsistencies as warnings in the logs as an aid to
anyone debugging weird behavior, so that they can see situations where
blame may be misplaced in the user-visible error messages.
We've allowed the legacy SDK an opt-out from the post-apply safety checks,
but previously we produced only a generic warning message in that case.
Now instead we'll still run the safety checks, but report the results in
the logs instead of as error diagnostics.
This should allow developers who are debugging strange interactions
between buggy legacy providers to get better insight into what's going
on upstream in order to help explain what's going on when these problems
inevitably get caught by other downstream safety checks when trying to
make use of these invalid results.
We've been gradually adding safety checks of this sort throughout the
lifecycle to help ensure that buggy providers can't introduce
hard-to-diagnose downstream failures and misbehavior. This completes the
set by verifying during plan time that the provider has produced a plan
that actually achieves the goals defined in the configuration.
In particular, this catches the situation where a provider may incorrectly
override a value explicitly set in configuration, which avoids creating
confusion by betraying the reasonable user expectation that referencing an
explicitly-defined attribute will produce exactly the value shown in
configuration.
The helper/schema handling of lists loses empty string values, but
retains the correct count. Only re-count the values if the count is
missing entirely, and allow our shims to re-populate the zero values.
In an earlier commit we changed objchange.ProposedNewObject so that the
task of populating unknown values for attributes not known during apply
is the responsibility of the provider's PlanResourceChange method, rather
than being handled automatically.
However, we were also using objchange.ProposedNewObject to construct the
placeholder new object for a deferred data resource read, and so we
inadvertently broke that deferral behavior. Here we restore the old
behavior by introducing a new function objchange.PlannedDataResourceObject
which is a specialized version of objchange.ProposedNewObject that
includes the forced behavior of populating unknown values, because the
provider gets no opportunity to customize a deferred read.
TestContext2Plan_createBeforeDestroy_depends_datasource required some
updates here because its implementation of PlanResourceChange was not
handling the insertion of the unknown value for attribute "computed".
The other changes here are just in an attempt to make the flow of this
test more obvious, by clarifying that it is simulating a -refresh=false
run, which effectively forces a deferred read since we skip the eager
read that would normally happen in the refresh step.
Now that ProposedNewState uses null to represent Computed attributes not
set in the configuration, the provider must fill in the unknown value for
"computed" in its plan result.
It seems that this test was incorrectly updated during our bulk-fix after
integrating the HCL2 work, but it didn't really matter because the
ReadDataSource function isn't called in the happy path anyway. But to
make the intent clearer here, we also now make ReadDataSource return an
error if it is called, making it explicit that no call is expected.
Data resources do not have a plan/apply distinction, so it is never valid
for a data resource to produce unknown values in its result object.
Unknown values in the data resource _config_ cause us to postpone the read
altogether, so a data source never receives unknown values as input and
therefore may never produce unknown values as output.
The shim layer for the legacy SDK type system is not precise enough to
guarantee it will produce identical results between plan and apply. In
particular, values that are null during plan will often become zero-valued
during apply.
To avoid breaking those existing providers while still allowing us to
introduce this check in the future, we'll introduce a rather-hacky new
flag that allows the legacy SDK to signal that it is the legacy SDK and
thus disable the check.
Once we start phasing out the legacy SDK in favor of one that natively
understands our new type system, we can stop setting this flag and thus
get the additional safety of this check without breaking any
previously-released providers.
No other SDK is permitted to set this flag, and we will remove it if we
ever introduce protocol version 6 in future, assuming that any provider
supporting that protocol will always produce consistent results.
Previously we would allow providers to change anything about the planned
object value during apply, possibly returning an entirely-unrelated object
of the same type. In practice this led to some subtle bugs where a single
planned attribute value would change during apply and cause a downstream
failure due to a dependent resource now seeing input other than what
_it_ expected during plan.
Now we'll produce an explicit error message for this case which places the
blame with the correct party: the upstream resource that changed. Without
this, unexpected changes would often lead to the downstream resource
implementation being blamed in error messages even though it was just
reacting to the change from upstream.
As with most errors during apply, we'll still save the updated value in
the state but we'll halt the walk to ensure that the unexpected value
cannot propagate further and cause the result to potentially diverge
greatly from the changeset shown in the plan.
Compared to Terraform 0.11, we expect to see this error in many of the
same cases we saw the "diffs didn't match during apply" error in earlier
versions, since it is likely that many errors of that sort were the result
of unexpected upstream changes being incorrectly blamed on the downstream
resource that then used the result.
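A minimal example of the consistency check itself, using the AssertObjectCompatible helper mentioned earlier (the one-attribute schema is invented for the example):

```go
package main

import (
	"fmt"

	"github.com/hashicorp/terraform/configs/configschema"
	"github.com/hashicorp/terraform/plans/objchange"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	schema := &configschema.Block{
		Attributes: map[string]*configschema.Attribute{
			"ami": {Type: cty.String, Optional: true},
		},
	}
	planned := cty.ObjectVal(map[string]cty.Value{"ami": cty.StringVal("ami-1")})
	actual := cty.ObjectVal(map[string]cty.Value{"ami": cty.StringVal("ami-2")})

	// A non-empty result means the provider changed something during
	// apply that it had promised in the plan.
	for _, err := range objchange.AssertObjectCompatible(schema, planned, actual) {
		fmt.Println(err)
	}
}
```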
Because Terraform Core has traditionally not checked that the final apply
result is consistent with what was planned, some of our apply tests were
producing inconsistent results.
Here we fix all of that so that they produce something compatible with
what they planned. This doesn't actually achieve anything in isolation,
but we're about to start enforcing this consistency in a subsequent
commit.
We were previously using cty.Path.Apply, which serves a similar purpose
but implements the more restrictive traversal behaviors down at the cty
layer. hcl.ApplyPath uses the same rules as HCL expressions and so ensures
consistent behavior with normal user expressions.
cty.Path.Apply also previously had a crashing bug (discussed in #20084)
that was causing a panic here. That has now been fixed in cty, but since
we're no longer using it here that's a moot point. The HCL traversing
implementation has been fuzz-tested and unit tested a lot more thoroughly
so should not run into the same crashers we saw with cty before.
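For example, applying a path with HCL's traversal rules (using go-cty and HCL v2 directly):

```go
package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	obj := cty.ObjectVal(map[string]cty.Value{
		"tags": cty.MapVal(map[string]cty.Value{
			"name": cty.StringVal("example"),
		}),
	})
	path := cty.Path{}.GetAttr("tags").Index(cty.StringVal("name"))

	// hcl.ApplyPath follows the same rules as HCL expressions, so the
	// result matches what a user would get writing obj.tags["name"].
	val, diags := hcl.ApplyPath(obj, path, nil)
	if diags.HasErrors() {
		panic(diags.Error())
	}
	fmt.Println(val.AsString()) // example
}
```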
When elements are removed from a list, not all attributes may be present
in the diff. Once the individual attribute diffs are applied, use the
length to truncate the flatmapped list to the correct length.
If there were no matching keys, and there was no diff at all, don't set
a zero count for the container. Normally providers can't reliably detect
empty vs unset here, but there are some cases that worked.
Fix a missing prefix in the map recount. This generally passes tests
since the actual count should already be there and be correct, and the
extra key is then ignored by the shims.
The previous version assumed the diff could be applied verbatim, and
only used the schema at the top level since diffs are "flat". This
turned out to not work reliably with nested blocks. The new Apply method
is driven completely by the schema, and handles nested blocks separately
from other collections.
In an ideal world, providers are supposed to respond to errors during
apply by returning a partial new state alongside the error diagnostics.
In practice though, our SDK leaves the new value set to nil for certain
errors, which was causing Terraform to "forget" the object altogether by
assuming that the provider intended to say "null".
We now adjust that assumption to apply only in the delete case. In all
other cases (including updates) we retain the prior state if the new
state is given as nil. Although we could potentially fix this in the SDK
itself, I expect this is a likely bug in other future SDKs for other
languages too, so this new assumption is a safer one to make to be
resilient to data loss when providers don't behave perfectly.
Providers that return both nil new value and no errors are considered
buggy, but unfortunately that applies to the mocks in many of our tests,
so for pragmatic reasons we can't generate an error for that case as we do
for other "should never happen" situations. Instead, we'll just retain the
prior value in the state so the user can retry.
There are a few constructs from 0.11 and prior that cause 0.12 parsing to
fail altogether, which previously created a chicken/egg problem because
we need to install the providers in order to run "terraform 0.12upgrade"
and thus fix the problem.
This changes "terraform init" to use the new "early configuration" loader
for module and provider installation. This is built on the more permissive
parser in the terraform-config-inspect package, and so it allows us to
read out the top-level blocks from the configuration while accepting
legacy HCL syntax.
In the long run this will let us do version compatibility detection before
attempting a "real" config load, giving us better error messages for any
future syntax additions, but in the short term the key thing is that it
allows us to install the dependencies even if the configuration isn't
fully valid.
Because backend init still requires full configuration, this introduces a
new mode of terraform init where it detects heuristically if it seems like
we need to do a configuration upgrade and does a partial init if so,
before finally directing the user to run "terraform 0.12upgrade" before
running any other commands.
The heuristic here is based on two assumptions:
- If the "early" loader finds no errors but the normal loader does, the
configuration is likely to be valid for Terraform 0.11 but not 0.12.
- If there's already a version constraint in the configuration that
excludes Terraform versions prior to v0.12 then the configuration is
probably _already_ upgraded and so it's just a normal syntax error,
even if the early loader didn't detect it.
Once the upgrade process is removed in 0.13.0 (users will be required to
go stepwise 0.11 -> 0.12 -> 0.13 to upgrade after that), some of this can
be simplified to remove that special mode, but the idea of doing the
dependency version checks against the liberal parser will remain valuable
to increase our chances of reporting version-based incompatibilities
rather than syntax errors as we add new features in future.
In prior versions of Terraform we permitted inconsistent use of indexes
in resource references, but as of 0.12 the index usage must correlate
properly with whether "count" is set on the resource.
Since users are likely to have existing configurations with incorrect
usage, here we introduce some specialized error messages for situations
where we can detect such issues statically. This seems to cover all of the
common patterns we've seen in practice.
Some usage patterns will fall back on a less-helpful dynamic error here,
but no configurations coming from 0.11 can end up that way because 0.11
did not permit forms such as aws_instance.no_count[count.index].bar that
this validation would not be able to "see".
Our configuration upgrade tool also contains a fix for this already, but
it takes a more conservative approach of adding the index [1] rather than
[count.index] because it can't be sure (without human help) if correlation
of indices is what was intended.
Previously we used the native slash type for the host platform, but that
leads to issues if the same configuration is applied on both Windows and
non-Windows systems.
Since Windows supports slashes and backslashes, we can safely return
always slashes here and require that users combine the result with
subsequent path parts using slashes, like:
"${path.module}/foo/bar"
Previously the above would lead to an error on Windows if path.module
contained any backslashes.
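The normalization is essentially filepath.ToSlash; a minimal sketch (pathModuleValue is an invented stand-in for the real implementation):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// pathModuleValue normalizes whatever OS-native path we hold for the
// module so configurations behave the same on every platform.
func pathModuleValue(modulePath string) string {
	// No-op on Unix; converts backslashes to slashes on Windows.
	return filepath.ToSlash(modulePath)
}

func main() {
	fmt.Println(pathModuleValue(`modules\network`)) // modules/network on Windows
}
```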
This is not really possible to unit test directly right now since we
always run our tests on Unix systems and filepath.ToSlash is a no-op on
Unix. However, this does include some tests for the basic behavior to
verify that it's not regressed as a result of this change.
This will need to be reported in the changelog as a potential breaking
change, since anyone who was using Terraform _exclusively_ on Windows may
have been using expressions like "${path.module}foo\\bar" which they will
now need to update.
This fixes #14986.
We already catch indirect cycles through the normal cycle detector, but
we never create self-edges in the graph so we need to handle a direct
self-reference separately here.
The prior behavior was simply to produce an incorrect result (since the
local value wasn't assigned a new value yet).
This fixes #18503.
Older versions of terraform could save the backend hash number in a
value larger than an int.
While we could conditionally decode the state into an intermediary data
structure for upgrade, or detect the specific decode error and modify
the json, it seems simpler to just decode into the most flexible value
for now, which is a uint64.
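A minimal sketch of the flexible decode target:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// backendState mirrors the relevant shape: older Terraform versions
// could write a hash value larger than a signed int can hold, so
// uint64 is the most flexible target.
type backendState struct {
	Type string `json:"type"`
	Hash uint64 `json:"hash"`
}

func main() {
	// A hash too large for int64, but fine for uint64.
	raw := []byte(`{"type":"s3","hash":14989490393907935895}`)
	var bs backendState
	if err := json.Unmarshal(raw, &bs); err != nil {
		panic(err)
	}
	fmt.Println(bs.Hash)
}
```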
Make sure that NodeDestroyableDataResource has a ResolvedProvider to
call EvalWriteState. This entails setting the ResolvedProvider in
concreteResourceDestroyable, as well as calling EvalGetProvider in
NodeDestroyableDataResource to load the provider schema.
Even though writing the state for a data destroy node should just be
removing the instance, every instance written sets the Provider for the
entire resource. This means that when scaling back a counted data
source, if the removed instances are written last, the data source will
be missing the provider in the state.
Validate should not require state or changes to be present. Break out
early when using evaluationStateData during walkValidate before checking
state or changes, to prevent errors when indexing resources that haven't
been expanded.
Both depends_on and ignore_changes contain references to objects that we
can validate.
Historically Terraform has not validated these, instead just ignoring
references to non-existent objects. Since there is no reason to refer to
something that doesn't exist, we'll now verify this and return errors so
that users get explicit feedback on any typos they may have made, rather
than just wondering why what they added seems to have no effect.
This is particularly important for ignore_changes because users have
historically used strange values here to try to exploit the fact that
Terraform was resolving ignore_changes against a flatmap. This will give
them explicit feedback for any odd constructs that the configuration
upgrade tool doesn't know how to detect and fix.
If an instance object in state has an earlier schema version number then
it is likely that the schema we're holding won't be able to decode the
raw data that is stored. Instead, we must ask the provider to upgrade it
for us first, which might also include translating it from flatmap form
if it was last updated with a Terraform version earlier than v0.12.
This ends up being a "seam" between our use of int64 for schema versions
in the providers package and uint64 everywhere else. We intend to
standardize on int64 everywhere eventually, but for now this remains
consistent with existing usage in each layer to keep the type conversion
noise contained here and avoid mass-updates to other Terraform components
at this time.
This also includes a minor change to the test helpers for the
backend/local package, which were inexplicably setting a SchemaVersion of
1 on the basic test state but setting the mock schema version to zero,
creating an invalid situation where the state would need to be downgraded.
There's no reason for a negative version, so by blocking it now we'll
ensure that none creep in.
The more practical short-term motivation for this is that we're still
using uint64 for these internally in some cases and so this restriction
ensures that we won't run into rough edges when converting from int64 to
uint64 at those boundaries until we later fix everything to use int64
consistently.
Previously we were making an invalid assumption in evaluating module call
references (like module.foo) that the module must exist, which is
incorrect for that particular case because it's a reference to a child
module, not to an object within the current module.
However, now that we have the mechanism for static validation of
references, we'll deal with this one there so it can be caught sooner.
That then makes the original assumption valid, though for a different
reason.
This is verified by two new context tests for validation:
- TestContext2Validate_invalidModuleRef
- TestContext2Validate_invalidModuleOutputRef
Previously we were fetching these from the provider but then immediately
discarding the version numbers because the schema API had nowhere to put
them.
To avoid a late-breaking change to the internal structure of
terraform.ProviderSchema (which is constructed directly all over the
tests) we're retaining the resource type schemas in a new map alongside
the existing one with the same keys, rather than just switching to
using the providers.Schema struct directly there.
The methods that return resource type schemas now return two arguments,
intentionally creating a little API friction here so each new caller can
be reminded to think about whether they need to do something with the
schema version, though it can be ignored by many callers.
Since this was a breaking change to the Schemas API anyway, this also
fixes another API wart where there was a separate method for fetching
managed vs. data resource types and thus every caller ended up having a
switch statement on "mode". Now we just accept mode as an argument and
do the switch statement within the single SchemaForResourceType method.
In the initial move to HCL2 we started relying only on full expression
evaluation to catch attribute errors, but that's not sufficient for
resource attributes in practice because during validation we can't know
yet whether a resource reference evaluates to a single object or to a
list of objects (if count is set).
To address this, here we reinstate some static validation of resource
references by analyzing directly the reference objects, disregarding any
instance index if present, and produce errors if the remaining subsequent
traversal steps do not correspond to items within the resource type
schema.
This also allows us to produce some more specialized error messages for
certain situations. In particular, we can recognize a reference like
aws_instance.foo.count, which in 0.11 and prior was a weird special case
for determining the count value of a resource block, and offer a helpful
error showing the new length(aws_instance.foo) usage pattern.
This eventually delegates to the static traversal validation logic that
was added to the configschema package in a previous commit, which also
includes some specialized error messages that distinguish between
attributes and block types in the schema so that the errors relate more
directly to constructs the user can see in the configuration.
In future we could potentially move more of the checks from the dynamic
schema construction step to the static validation step, but resources
are the reference type that most needs this immediately due to the
ambiguity caused by the instance indexing syntax. We can safely refactor
other reference types to be statically validated in later releases.
This is verified by two pre-existing context validate tests which we
temporarily disabled during earlier work (now re-enabled) and also by a
new validate test aimed specifically at the special case for the "count"
attribute.
These overly-general interfaces are no longer used anywhere, and their
presence in the important-sounding semantics.go file was a distracting
red herring.
We'd previously replaced the one checker in here with a simple helper
function for checking input variables, and that's arguably more at home
with all of the other InputValue functionality in variables.go, and that
allows us to remove semantics.go (and its associated test file) altogether
and make room for some forthcoming new files for static validation.
It's possible that a computed collection could be handled by the
attribute name, rather than the index count value.
Use a new testDiffFn for some tests, which don't work with the old
function that can't determine `computed` without the schema.
Comments here indicate that this was erroneously returning an error but
we accepted it anyway to get the tests passing again after other work.
The tests over in the "terraform" package agree that cancelling should be
a successful outcome rather than an error.
I think that cancelling _should_ actually be an error, since Terraform did
not complete the operation it set out to complete, but that's a change
we'd need to make cautiously since automation wrapper scripts may be
depending on the success-on-cancel behavior.
Therefore this just fixes the command package test to agree with the
Terraform package tests and adds some FIXME notes to capture the potential
that we might want to update this later.
Since the state models can't preserve unknown values, we need to rely on the plan to persist these until the effective configuration can be fully resolved during the apply phase.
If plan and apply are both run against the same context then we still have
the planned output values in memory while we're doing the apply walk, so
we need to make sure to update them along with the state as we learn the
final known values of each output.
There were actually two different bugs here:
- We weren't removing any existing planned change for an output when
setting a new one. In retrospect a map would've been a better data
structure for the output changes, rather than a slice to mimic what we
do for resource instance objects, but for now we'll leave the structures
alone and clean up as needed. (The set of outputs should be small for
any reasonable configuration, so the main impact of this is some ugly
code in RemoveOutputChange.)
- RemoveOutputChange itself had a bug where it was iterating over the
resource changes rather than the output changes. This didn't matter
before because we weren't actually using that function, but now we are.
This fix is confirmed by restoring various existing context apply tests
back to passing again.
This new source type should be used for variables loaded from .tfvars files that were explicitly passed as command line arguments (e.g. -var-file=foo.tfvars)
Just as when we resolve single output values we must check to see if there
is a planned new value for an output before using the value in state,
because the planned new value might contain unknowns that can't be
represented directly in the state (and would thus be incorrectly returned
as null).
Resource timeouts were a separate config block, but did not exist in the
resource schema. Insert any defined timeouts when generating the
configschema.Block so that the fields can be accepted and validated by
core.
This ensures more HCL1/HIL-like behaviors when dealing with nested blocks
that are not set in the configuration, which is important for
compatibility with helper/schema's validation logic.
There are several steps here and a number of them can include reaching out
to remote servers or executing local processes, so it's helpful to have
some trace logs to better narrow down causes of errors and hangs during
this step.
The legacy test wasn't really testing anything useful, just that planning
a module with no diff returned a module with no diff.
There's no reason to convert this to the new plans, since there's no
legacy behavior to match.
In the reorganization of the data source read code we missed the fact that
the plan phase must _always_ generate a read data diff, never directly
read, because the state generated during plan is throwaway.
This only matters in the -refresh=false case, since normally refresh has
already taken care of this, but that is still an important case, covered
by the TestContext2Apply_dataBasic test.
When we're being asked to destroy everything, we ideally want to end up
with a totally empty state. Normally we will conservatively keep around
the "husks" of resources (what's left after all of the instances have been
destroyed) unless they are configured without count or for_each, but in
this special case we'll prune those out.
The implication of this is that in "weird" expression contexts that happen
before the next "terraform plan", such as evaluation in
"terraform console" or expressions in data resources and provider blocks
that get evaluated during the refresh walk, we will see these results
as unknown rather than as empty lists of objects. We accept that weirdness
for now because in a future release we are likely to remove "refresh" as
a separate walk anyway, doing all of that work during the plan walk where
we can ensure that these values are properly re-populated before trying
to use them.
Some mock objects will not have any mock behavior configured for the
GetSchema method, so we should just return a valid-but-empty schema in
that case, rather than panicking as we did before.
These are similar to the symbols of the same name in package "providers".
terraform.ProvisionerFactory is now an alias for provisioners.Factory, so
we can defer updating all of the existing users of it.
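The alias itself is a one-line declaration in the terraform package:

```go
package terraform

import "github.com/hashicorp/terraform/provisioners"

// ProvisionerFactory is an alias (not a distinct type) so existing
// callers keep compiling without changes.
type ProvisionerFactory = provisioners.Factory
```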
There are still some errors left, because our expression evaluator now
does more validation than before and so we'll need to (in a subsequent
commit) actually use a fixture configuration for these tests so that the
validations will allow the expressions to be validated.
This test was occasionally failing due to a missing graph edge causing it
to be non-deterministic.
The graph edge was missing because our standard schema doesn't quite match
the config fixture and so the reference checker was not finding the "blah"
argument on aws_instance.a.
This change also includes a 100ms pause for the b node just to make this
potential race more likely to hit the "wrong" ordering when the graph is
not complete.
The weird special-case behaviors of testDiffFn were interfering with the
outcome of this test, but we don't actually need any of those special
behaviors here so we'll use a very simple PlanResourceChangeFn
implementation instead, just letting the built-in merge behavior in core
take care of it.
Since we started using experimental Go Modules our editor tooling hasn't
been fully functional, apparently including format-on-save support. This
is a catchup to get everything back straight again.
One of the assumptions this test was checking no longer holds: we don't
retain outputs for non-root modules in persistent state, because we can
always re-populate these on a future run by evaluating the configuration.
The old-fashioned formatting of "Provider" on the shimmed state here was
causing the state upgrade code to treat it like a module-relative address,
rather than an absolute address as we now expect.
This error message appears in a situation that is often confusing for
users, since the connection between resources and their providers in the
state is not something we draw attention to in the user experience of
Terraform.
This new error message tries to be a bit clearer about what the user must
do to resolve it. It's still not perfect since it doesn't cover the
variant of this problem where an entire module containing a provider block
and resources has been removed at the same time, but since there isn't
an easily-summarizable way to continue in that state this will need to
do for the moment, until we find a way to file off that rough edge in
the workflow.
Configuration blocks can contain sensitive information, so better to just
talk about them by reference (in this case, source location) rather than
embedding them directly, to reduce the risk of accidental information
leakage through sharing logs for debug purposes.
This was already updated for the new state types earlier, but since then
we adjusted how deposed instances are written out in the old string
representation of state, and so this regressed.
These tests show that we're still not fully pruning modules from the state
in all cases, due to us not being able to fully prune out modules that
contain resources with count set after a destroy, but this is no worse
than before so we'll accept it for now and address this separately later.
A module heading without "<no state>" but also without any instances
listed is the rendering for a module containing a resource that has no
instances, since our old string rendering of state doesn't represent
resources themselves.
We previously had mechanisms to clean up only individual instance states,
leaving behind empty resource husks in the state after they were all
destroyed.
This takes care of it in the "orphan" case. It does not yet do it in the
"terraform destroy" or "terraform plan -destroy" cases because we don't
have anywhere to record in the plan that we're actually destroying and so
the resource configurations should be ignored and _everything_ should be
cleaned. We'll let the state be not-quite-empty in that case for now,
since it doesn't really hurt; cleaning up orphans is the main case because
the state will live on afterwards and so leftover cruft will accumulate
over the course of many changes.
Since this one has a situation where there are two deposed objects for
the same instance at once, we can't rely on comparing state strings: they
are not deterministic when multiple deposed objects are present.
Instead, we do more surgical comparisons directly on the state model
objects, which is not quite as robust but still gets us the main stuff we
care about here, to be followed up by another checkStateString further
down for the final state.
Tainted objects now also remember which provider they belong to (via the
resource state they are attached to) and so the stringified state output
here is slightly different.
This was relying on a no-longer-valid mechanism for accessing the "count"
value from a resource block. The original issue this test was written to
cover is not really such a sharp edge anymore, since the length is taken
from the state during apply rather than from configuration, but it's still
a good case to cover.
This test was incorrectly amended on the first pass to create a
configuration snapshot from the step zero configuration, rather than the
step one configuration that the save plan is built from.
Along with that, it needed various other minor updates to match with
details that have shifted:
- "id" and "type" attributes must be explicitly declared in schema
- template_file.parent has count = 1, which now causes it to get an index
and be a list where before it did not.
If we don't set it, we end up creating an invalid plan where the destroy
changes don't have a provider address set, which then later fails
decoding when round-tripped through a planfile.
This also includes some extra safety checks in EvalDiff and
EvalDiffDestroy so that we can catch this bug sooner in future.
This change is verified by
TestContext2Apply_plannedDestroyInterpolatedCount, which is now passing.
A plan file without a backend set is not valid, but at the level we're
testing in this package we don't really care about backends so we'll just
set a default one if the caller doesn't set something more specific, and
then we'll just ignore it completely when reading back.
It doesn't make sense to ignore_changes when the prior value is null,
since we have to create something before we can ignore changes to it.
This change is verified by TestContext2Apply_ignoreChangesWildcard.
This test was relying on the feature of the old provisioner API that gave
provisioners full access to the instance state of what they were
provisioning.
We no longer do this, and so instead the ApplyFn must distinguish the
instances using a value from the provisioner's own configuration.
This can happen if unknown values in the plan actually end up being
identical to the prior values once resolved. In that case, we'll just make
no change at all.
This is verified by TestContext2Apply_ignoreChangesWithDep.
Due to a quirk in how testDiffFn constructs its diff, compute doesn't
actually get included in the final result anymore under the new shims.
This result is still correct, nonetheless.
The prior behavior being asserted by this test was incorrect, since the
configuration calls for there to be two instances of the resource at the
end.
We also now assert on the generated plan since it's important to verify
that we are indeed planning to replace the zeroth instance but not the
first instance (which doesn't yet exist).
We now include attribute changes in destroy diffs, so the expected output
of this test includes these changes.
Also includes a fix to legacyDiffComparisonString to actually sort the
attribute changes by name in the rendered diff.
The testDiffFn doesn't include "compute" in the diff it produces and so
it no longer appears in the shimmed output.
This is just a quirk of this weird mock implementation; real providers
always copy all of the values from the config into the diff before adding
in any other changes.
The only reasonable usage of these methods is for them to run concurrently
with other methods, so we mustn't hold a lock to do this work. For tests
that deal with stopping, it's the test's own responsibility to deal with
any concurrency issues that arise from their StopFns running concurrently
with other mock functions.
Since stopping is a rather complex mechanism that relies on correct
handling of concurrency, it's handy to have these logs here to debug when
things don't happen in quite the right order.
Comment out an out-of-date CBD test. The test no longer works due to
the CBD status being checked during plan, but the test case may still
have some value which we can review later.
Update the text in the CBD transformer to reflect the
s/ancestor/descendent/ change.
There does not appear to be any real reason that these Schemas fields are
not exported, and exporting them makes it possible to directly construct
Schemas for tests without pulling in an entire context.
Now that we populate a resource-level state pre-emptively for a resource,
before we apply the instances of that resource, it is no longer true that
this produces only one resource state, but the test below (comparing the
string version of the state) already captures the expected behavior and
so this additional check was redundant anyway.
This test was incorrectly updated to the new hook API; it should've been
recording the individual resource state values passed to the hook, not
the original state as a whole (which is not yet updated at the point
when the hook is called).
Reduce the graphs, as they are too ungainly to compare without
reduction. We also depend on reduction in all use cases, so we should be
testing the graph as we use it anyway.
We can't catch invalid attributes in validate at the moment, because the
lack of count information causes the references to return unknown. Make
sure they fail in plan, and mark the validate tests to fix later.
Previously we used a single plan action "Replace" to represent both the
destroy-before-create and the create-before-destroy variants of replacing.
However, this forces the apply graph builder to jump through a lot of
hoops to figure out which nodes need it forced on and rebuild parts of
the graph to represent that.
If we instead decide between these two cases at plan time, the actual
determination of it is more straightforward because each resource is
represented by only one node in the plan graph, and then we can ensure
we put the right nodes in the graph during DiffTransformer and thus avoid
the logic for dealing with deposed instances being spread across various
different transformers and node types.
As a nice side-effect, this also allows us to show the difference between
destroy-then-create and create-then-destroy in the rendered diff in the
CLI, although this change doesn't fully implement that yet.
Remove a test that is no longer needed, since provider must be
explicitly defined for orphaned modules, and is covered in other context
tests.
Update a test fixture to better represent the original missing map
issue, since the ability to detect nil now made the old test invalid.
This test was relying on an odd definition of "invalid" from prior
versions of Terraform, where it would be an error to access an attribute
that exists as part of the resource type schema but the provider
implementation neglected to set it.
This was an implementation detail though, caused by the flatmap
representation and the fact that core itself didn't have access to the
schema to do static validation. Now the original usage returns a null
value because the "value" attribute is defined, and so we need a new
test fixture that accesses an attribute that is not defined in the schema
at all.
I misunderstood the logic here on the first pass of porting to the new
provider and state types: EvalUndeposeState is supposed to return the
deposed object back to being current again, so we can undo the deposing
in the case where the create leg fails.
If we don't do this, we end up leaving the instance with no current object
at all and with its prior object deposed, and then the later destroy
node deletes that deposed object, leaving the user with no object at all.
For safety we skip this restoration if there _is_ a new current object,
since a failed create can still produce a partial result which we need
to keep to avoid losing track of any remote objects that were successfully
created.
We now correctly prune out empty modules after destroying everything
inside them, so we need to update this expectation string to match the new
behavior, rather than before when it was actually describing a buggy
result.