- Added a parameter in dagster.yaml that can be used to increase the time that Dagster waits when spinning up a gRPC server before timing out. For more information, see https://docs.dagster.io/deployment/dagster-instance#code-servers.
- Added a new GraphQL field assetMaterializations that can be queried off of a DagsterRun field. You can use this field to fetch the set of asset materialization events generated in a given run within a GraphQL query.
- Docstrings on functions decorated with the @resource decorator will now be used as resource descriptions, if no description is explicitly provided.
- You can now point dagit -m or dagit -f at a module or file that has asset definitions but no jobs or asset groups, and all the asset definitions will be loaded into Dagit.
- AssetGroup now has a materialize method which executes an in-process run to materialize all the assets in the group.
- AssetGroups can now contain assets with different partition_defs.
- Asset materializations produced by the default asset IO manager, fs_asset_io_manager, now include the path of the file where the values were saved.
- You can now disable the max_concurrent_runs limit on the QueuedRunCoordinator by setting it to -1. Use this if you only want to limit runs using tag_concurrency_limits.
- [dagit] Asset graphs are now rendered asynchronously, which means that Dagit will no longer freeze when rendering a large asset graph.
- [dagit] When viewing an asset graph, you can now double-click on an asset to zoom in, and you can use arrow keys to navigate between selected assets.
- [dagit] The “show whitespace” setting in the Launchpad is now persistent.
- [dagit] A bulk selection checkbox has been added to the repository filter in navigation or Instance Overview.
- [dagit] A “Copy config” button has been added to the run configuration dialog on Run pages.
- [dagit] An “Open in Launchpad” button has been added to the run details page.
- [dagit] The Run page now surfaces more information about start time and elapsed time in the header.
- [dagster-dbt] The dbt_cloud_resource has a new get_runs() function to get a list of runs matching certain parameters from the dbt Cloud API (thanks @kstennettlull!)
- [dagster-snowflake] Added an authenticator field to the connection arguments for the snowflake_resource (thanks @swotai!).
- [celery-docker] The celery docker executor has a new configuration entry container_kwargs that allows you to specify additional arguments to pass to your docker containers when they are run.
- Fixed an issue where loading a Dagster repository would fail if it included a function to lazily load a job, instead of a JobDefinition.
- Fixed an issue where trying to stop an unloadable schedule or sensor within Dagit would fail with an error.
- Fixed a telemetry contention bug on Windows when running the daemon.
- [dagit] Fixed a bug where the Dagit homepage would claim that no jobs or pipelines had been loaded, even though jobs appeared in the sidebar.
- [dagit] When filtering runs by tag, tag values that contained the : character would fail to parse correctly, and filtering would therefore fail. This has been fixed.
- [dagster-dbt] When running the “build” command using the dbt_cli_resource, the run_results.json file will no longer be ignored, allowing asset materializations to be produced from the resulting output.
- [dagster-airbyte] Responses from the Airbyte API with a 204 status code (like you would get from /connections/delete) will no longer raise an error (thanks @HAMZA310!)
- [dagster-shell] Fixed a bug where shell ops would not inherit environment variables if any environment variables were added for ops (thanks @kbd!)
- [dagster-postgres] Usernames are now URL-quoted in addition to passwords.
- The MySQL storage implementations for Dagster storage are no longer marked as experimental.
- run_id can now be provided as an argument to execute_in_process.
- The text on dagit’s empty state no longer mentions the legacy concept “Pipelines”.
- Now, within the IOManager.load_input method, you can add input metadata via InputContext.add_input_metadata. These metadata entries will appear on the LOADED_INPUT event and, if the input is an asset, be attached to an AssetObservation. This metadata is viewable in dagit.
- Fixed a set of bugs where schedules and sensors would get out of sync between dagit and dagster-daemon processes. This would manifest in schedules / sensors getting marked as “Unloadable” in dagit, and ticks not being registered correctly. The fix involves changing how Dagster stores schedule/sensor state and requires a schema change using the CLI command dagster instance migrate. Users who are not running into this class of bugs may consider the migration optional.
- root_input_manager can now be specified without a context argument.
- Fixed a bug that prevented root_input_manager from being used with VersionStrategy.
- Fixed a race condition between daemon and dagit writing to the same telemetry logs.
- [dagit] In dagit, using the “Open in Launchpad” feature for a run could cause server errors if the run configuration yaml was too long. Runs can now be opened from this feature regardless of config length.
- [dagit] On the Instance Overview page in dagit, runs in the timeline view sometimes showed incorrect end times, especially batches that included in-progress runs. This has been fixed.
- [dagit] In the dagit launchpad, reloading a repository should present the user with an option to refresh config that may have become stale. This feature was broken for jobs without partition sets, and has now been fixed.
- Fixed issue where passing a stdlib typing type as dagster_type to input and output definition was incorrectly being rejected.
- [dagster-airbyte] Fixed issue where AssetMaterialization events would not be generated for streams that had no updated records for a given sync.
- [dagster-dbt] Fixed issue where including multiple sets of dbt assets in a single repository could cause a conflict with the names of the underlying ops.
- [helm] Added configuration to explicitly enable or disable telemetry.
- Added a new IO manager for materializing assets to Azure ADLS. You can specify this IO manager for your AssetGroups by using the following config:
```python
from dagster import AssetGroup
from dagster_azure.adls2 import adls2_pickle_asset_io_manager, adls2_resource

# upstream_asset and downstream_asset are assets defined elsewhere with @asset.
asset_group = AssetGroup(
    [upstream_asset, downstream_asset],
    resource_defs={"io_manager": adls2_pickle_asset_io_manager, "adls2": adls2_resource},
)
```
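The dagster-postgres entry above notes that usernames are now URL-quoted in addition to passwords. The sketch below illustrates why this matters when credentials are embedded in a connection URL. It uses only the Python standard library, and `build_postgres_url` is a hypothetical helper written for illustration, not the dagster-postgres implementation.

```python
from urllib.parse import quote, unquote, urlsplit


def build_postgres_url(username, password, hostname, db_name, port=5432):
    """Build a connection URL, percent-encoding both username and password.

    Hypothetical helper for illustration only.
    """
    # safe="" makes quote() encode "/" as well, so no reserved character
    # from the credentials can leak into the URL structure.
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote(username, safe=""),
        quote(password, safe=""),
        hostname,
        port,
        db_name,
    )


# A username containing "@" would otherwise be confused with the host separator.
url = build_postgres_url("svc@corp", "p@ss:w/rd", "localhost", "dagster")
parts = urlsplit(url)
print(url)
print(parts.hostname, unquote(parts.username))
```

Without quoting, the `@` in the username would make the host portion of the URL ambiguous; with quoting, the URL splits cleanly and the original username can be recovered by unquoting.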
- Added ability to set a custom start time for partitions when using @hourly_partitioned_config, @daily_partitioned_config, @weekly_partitioned_config, and @monthly_partitioned_config.
- Run configs generated from partitions can be retrieved using the PartitionedConfig.get_run_config_for_partition_key function. This will allow the use of the validate_run_config function in unit tests.
- [dagit] If a run is re-executed from failure, and the run fails again, the default action will be to re-execute from the point of failure, rather than to re-execute the entire job.
- PartitionedConfig now takes an argument tags_for_partition_fn which allows for custom run tags for a given partition.
- Fixed a bug in the message for reporting Kubernetes run worker failures
- [dagit] Fixed issue where re-executing a run that materialized a single asset could end up re-executing all steps in the job.
- [dagit] Fixed issue where the health of an asset’s partitions would not always be up to date in certain views.
- [dagit] Fixed issue where the “Materialize All” button would be greyed out if a job had SourceAssets defined.
- Updated resource docs to reference “ops” instead of “solids” (thanks @joe-hdai!)
- Fixed formatting issues in the ECS docs
- Added IO manager for materializing assets to GCS. You can specify the GCS asset IO manager by using the following config for resource_defs in AssetGroup:
```python
from dagster import AssetGroup
from dagster_gcp.gcs import gcs_pickle_asset_io_manager, gcs_resource

# upstream_asset and downstream_asset are assets defined elsewhere with @asset.
asset_group = AssetGroup(
    [upstream_asset, downstream_asset],
    resource_defs={"io_manager": gcs_pickle_asset_io_manager, "gcs": gcs_resource},
)
```
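The Dagit fix noted earlier for run filters whose tag values contain a `:` character can be illustrated with a small parsing sketch. This is not Dagit's actual parser. It assumes a filter token of the form `tag:key=value`, and `parse_tag_filter` is a hypothetical name. It shows how splitting only on the first separator keeps colons inside the value intact.

```python
def parse_tag_filter(token):
    """Parse an assumed 'tag:key=value' filter token into (key, value).

    Hypothetical helper for illustration only.
    """
    # partition() splits on the FIRST ":" only, so later colons stay in the value.
    kind, colon, rest = token.partition(":")
    if colon != ":" or kind != "tag":
        raise ValueError("expected a token of the form tag:key=value")
    # Likewise, split key from value on the first "=" only.
    key, equals, value = rest.partition("=")
    if not equals or not key:
        raise ValueError("expected a token of the form tag:key=value")
    return key, value


token = "tag:source=s3://bucket/path"
naive_pieces = token.split(":")  # a naive split breaks the value apart
parsed = parse_tag_filter(token)
print(naive_pieces)
print(parsed)
```

A naive split on every `:` produces three pieces for this token, while the first-separator approach returns the key and the full value.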
- Improved the performance of storage queries run by the sensor daemon to enforce the idempotency of run keys. This should reduce the database CPU when evaluating sensors with a large volume of run requests with run keys that repeat across evaluations.
- [dagit] Added information on sensor ticks to show when a sensor has requested runs that did not result in the creation of a new run due to the enforcement of idempotency using run keys.
- [k8s] Run and step workers are now labeled with the Dagster run id that they are currently handling.
- If a step launched with a StepLauncher encounters an exception, that exception / stack trace will now appear in the event log.
- Fixed a race condition where canceled backfills would resume under certain conditions.
- Fixed an issue where exceptions that were raised during sensor and schedule execution didn’t always show a stack trace in Dagit.
- During execution, dependencies will now resolve correctly for certain dynamic graph structures that were previously resolving incorrectly.
- When using the forkserver start_method on the multiprocess executor, preload_modules have been adjusted to prevent libraries that change namedtuple serialization from causing unexpected exceptions.
- Fixed a naming collision between dagster decorators and submodules that sometimes interfered with static type checkers (e.g. pyright).
- [dagit] Postgres database connection management has been improved when watching actively executing runs.
- [dagster-databricks] The databricks_pyspark_step_launcher now supports steps with RetryPolicies defined, as well as RetryRequested exceptions.
- Docs spelling fixes - thanks @antquinonez!
- [dagit] Fixed issue where sensors could not be turned on/off in dagit.
- Fixed a bug with direct op invocation when used with funcsigs.partial that would cause incorrect InvalidInvocationErrors to be thrown.
- Internal code no longer triggers deprecation warnings for all runs.
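The sensor entries above mention that run keys are used to enforce idempotency, and that a tick can request runs that do not result in a new run. The idea can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the sensor daemon's storage query. `select_new_run_keys` is a hypothetical name, and the existing run keys are assumed to be available as an in-memory collection.

```python
def select_new_run_keys(requested_keys, existing_keys):
    """Split requested run keys into ones to launch and ones to skip.

    Hypothetical helper for illustration only. A key is skipped if a run
    with that key already exists, or if it repeats within the same batch.
    """
    seen = set(existing_keys)
    to_launch = []
    skipped = []
    for key in requested_keys:
        if key in seen:
            skipped.append(key)
        else:
            seen.add(key)
            to_launch.append(key)
    return to_launch, skipped


# "b" already has a run, and "a" is requested twice in the same evaluation.
to_launch, skipped = select_new_run_keys(["a", "b", "a", "c"], {"b"})
print(to_launch, skipped)
```

Only the first request for each previously unseen key is launched; the rest are reported as skipped, which corresponds to the skipped-run information that Dagit now shows on sensor ticks.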