HyFI Input Parameters

This document provides detailed explanations for all HyFI parameters in config_single_TEMPLATE.json and config_multi_TEMPLATE.json.



Metadata

Configuration metadata for documentation and tracking purposes.

Parameter

Type

Description

workflow_name

string

Name for this analysis workflow (e.g., “St. Leonard Sequence”)

workflow_version

string

Version number for tracking configuration changes (e.g., “1.0.0”)

created_date

string

ISO 8601 formatted date when config was created (e.g., “2025-11-11T00:00:00”)

description

string

Optional description of the analysis


Global Settings

Settings that apply to the entire workflow.

Common Settings (Single & Multi-Sequence)

Parameter

Type

Default

Description

output_directory

string

"./hyfi_output"

Directory where all output files will be saved. For single-sequence: all outputs go to this directory. For multi-sequence: individual sequence results go to subdirectories within this path

log_level

string

"INFO"

Logging verbosity. Options: "DEBUG" (verbose), "INFO" (standard), "WARNING" (warnings and errors only), "ERROR", "CRITICAL"

Multi-Sequence Only Settings

Parameter

Type

Default

Description

parallel_processing

boolean

false

Enable parallel processing of per-sequence analysis using multiple CPU cores. When true, sequences are dispatched to a ProcessPoolExecutor with up to max_workers workers. Per-sequence verbose output is written to <sequence>/<sequence>_processing.log instead of the terminal. Fault IDs are not contiguous in parallel mode (each sequence reserves a block of 10,000 IDs) but are always unique.

max_workers

integer

4

Maximum number of parallel worker processes for multi-sequence processing. Used only when parallel_processing is true. Range: 1-16 depending on system CPU cores. Recommended: 4-8 for typical systems

save_individual_results

boolean

true

Save intermediate and individual sequence analysis results to disk. When true, creates subdirectories for each sequence with full analysis outputs; when false, only final merged results are kept (saves disk space but loses per-sequence details)
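The ID-block behaviour described for parallel_processing can be pictured with a minimal sketch. It assumes each sequence gets a zero-based index and reserves the block starting at index × 10 000; the function names and the indexing scheme are illustrative assumptions, not HyFI's actual implementation.

```python
# Hypothetical illustration of per-sequence fault ID blocks (not HyFI's code).
BLOCK_SIZE = 10_000  # block size stated in the parallel_processing description


def fault_id_block(sequence_index, block_size=BLOCK_SIZE):
    """Return the ID range reserved for one sequence (assumed zero-based index)."""
    start = sequence_index * block_size
    return range(start, start + block_size)


def assign_fault_ids(faults_per_sequence):
    """Give each sequence IDs from its own block, so workers never collide."""
    ids = []
    for index, n_faults in enumerate(faults_per_sequence):
        if n_faults > BLOCK_SIZE:
            raise ValueError("sequence needs more IDs than one block provides")
        block = fault_id_block(index)
        ids.append(list(block[:n_faults]))
    return ids


# Three sequences with 3, 2 and 4 faults: IDs jump between blocks but never repeat.
ids = assign_fault_ids([3, 2, 4])
print(ids)
```

Because every worker draws only from its own block, no coordination between processes is needed, which is one plausible reason for accepting gaps in the numbering.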


Input Data

Configuration for input earthquake catalog files. See the Quickstart guide for information about preparing your data and using the built-in ECOS parser.

File Requirements

Parameter

Type

Required

Description

hypocenter_file

string

Yes

Path to hypocenter catalog file (e.g., “data_examples/A0_data.csv”)

hypocenter_separator

string

Yes

Column separator for hypocenter file. Options: "," (CSV), "\t" (TSV), ";" (semicolon)

focal_mechanism_file

string or null

No

Path to focal mechanism catalog. Set to null if not available

focal_mechanism_separator

string

No

Column separator for focal mechanism file. Options: ",", "\t", ";"

Required Hypocenter Columns

All hypocenter files must contain these 17 core columns (order not strict):

Column

Type

Description

ID

string

Event identifier (must be unique)

LAT

float

Latitude (degrees, WGS84)

LON

float

Longitude (degrees, WGS84)

DEPTH

float

Depth (kilometers below sea level, positive downward)

X

float

Easting coordinate (meters, typically Swiss LV95/CH1903+)

Y

float

Northing coordinate (meters, typically Swiss LV95/CH1903+)

Z

float

Vertical position (meters, negative = below datum)

EX

float

X-coordinate uncertainty (meters)

EY

float

Y-coordinate uncertainty (meters)

EZ

float

Z-coordinate uncertainty (meters)

YR

int

Year (4-digit, e.g., 2024)

MO

int

Month (1-12)

DY

int

Day of month (1-31)

HR

int

Hour (0-23)

MI

int

Minute (0-59)

SC

float

Second (0-59.999)

MAG

float

Magnitude (any scale: ML, Mw, mb, etc.)

Required Focal Mechanism Columns (Optional)

If provided, focal mechanism files must contain all hypocenter columns plus these 13 additional columns:

Column

Type

Description

A

int

Quality flag (1 or 2 = valid focal mechanism, 0 or null = no solution)

Strike1

float

Strike angle of nodal plane 1 (degrees, 0-360)

Dip1

float

Dip angle of nodal plane 1 (degrees, 0-90)

Rake1

float

Rake angle of nodal plane 1 (degrees, -180 to 180)

Strike2

float

Strike angle of nodal plane 2 (degrees, 0-360)

Dip2

float

Dip angle of nodal plane 2 (degrees, 0-90)

Rake2

float

Rake angle of nodal plane 2 (degrees, -180 to 180)

Pazim

float

P-axis azimuth (degrees, 0-360)

Pdip

float

P-axis dip (degrees, 0-90)

Tazim

float

T-axis azimuth (degrees, 0-360)

Tdip

float

T-axis dip (degrees, 0-90)

Q

int

Focal mechanism quality rating (1-5 scale, or null if unavailable)

Type

string

Event type classification (e.g., “normal”, “strike-slip”, “reverse”, “uncertain”)

Data Source Conversions

ECOS Catalog Parser

If you have ECOS ConsolidatedMergeCat and/or AssociateFM files, use the built-in parser:

hyfi parse-ecos --hypo ECOS_Merge_Bull+AbsRel+DDC+DDR_20260116.ConsolidatedMergeCat.csv \
                 --focals ECOS_Merge_Bull+AbsRel+DDC+DDR_20260116.AssociateFM.csv

The same command on a single line (both forms assume you run it from the directory containing the files):

hyfi parse-ecos --hypo ECOS_Merge_Bull+AbsRel+DDC+DDR_20260116.ConsolidatedMergeCat.csv --focals ECOS_Merge_Bull+AbsRel+DDC+DDR_20260116.AssociateFM.csv

This automatically converts ECOS pipe-delimited format to standard HyFI CSV format. See Quickstart: Using Your Own Data for detailed examples.

Other Formats

For earthquake catalogs from other sources:

  1. Prepare CSV file with required columns (order not strict, case-insensitive column names accepted)

  2. Specify the correct separator in configuration (",", "\t", or ";")

  3. Ensure data types match (integers for YR/MO/DY/HR/MI, floats for coordinates/magnitudes, etc.)

  4. Verify coordinate systems match your project (typically WGS84 for lat/lon, local projection for X/Y)

Example (tab-separated; set hypocenter_separator to "\t")

ID	LAT	LON	DEPTH	X	Y	Z	EX	EY	EZ	YR	MO	DY	HR	MI	SC	MAG
EV001	45.5234	9.7654	8.5	2683450.5	1247850.3	-8500	100	120	150	2024	3	15	14	30	45.2	3.2
EV002	45.5245	9.7665	9.2	2683460.2	1247860.1	-9200	110	130	160	2024	3	15	14	31	22.8	2.8

Data Validation

HyFI automatically validates input files during workflow initialization. The validator checks:

  • ✓ All required columns are present

  • ✓ Column data types are correct

  • ✓ Coordinate values are within reasonable ranges

  • ✓ Temporal values (YR, MO, DY, HR, MI) are valid calendar dates

  • ✓ Magnitude values are plausible (typically -2 to 8)

  • ✓ Focal mechanism angles are within valid ranges (0-360° for azimuths, 0-90° for dips, -180 to 180° for rakes)

Validation errors will be reported with specific guidance on how to fix the issues.
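The checks listed above can be approximated with a short standard-library sketch. It is not HyFI's validator: the function name, the message wording, and the exact bounds (latitude, longitude, and the "typically -2 to 8" magnitude range) are assumptions chosen for illustration, and only a subset of the checks is covered.

```python
import csv
import io
from datetime import date

REQUIRED_COLUMNS = ["ID", "LAT", "LON", "DEPTH", "X", "Y", "Z", "EX", "EY", "EZ",
                    "YR", "MO", "DY", "HR", "MI", "SC", "MAG"]


def validate_hypocenters(text, separator=","):
    """Return a list of problems found; an empty list means nothing was flagged."""
    reader = csv.DictReader(io.StringIO(text), delimiter=separator)
    header = {name.strip().upper() for name in (reader.fieldnames or [])}
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    for line_no, raw in enumerate(reader, start=2):
        # Column names are matched case-insensitively.
        row = {k.strip().upper(): v for k, v in raw.items() if k is not None}
        try:
            lat, lon, mag = float(row["LAT"]), float(row["LON"]), float(row["MAG"])
            date(int(row["YR"]), int(row["MO"]), int(row["DY"]))  # calendar check
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: non-numeric field or invalid calendar date")
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append(f"line {line_no}: LAT/LON out of range")
        if not (-2 <= mag <= 8):
            problems.append(f"line {line_no}: implausible magnitude {mag}")
    return problems


# Second event deliberately carries month 13 to trigger the calendar check.
SAMPLE = (
    "ID\tLAT\tLON\tDEPTH\tX\tY\tZ\tEX\tEY\tEZ\tYR\tMO\tDY\tHR\tMI\tSC\tMAG\n"
    "EV001\t45.5234\t9.7654\t8.5\t2683450.5\t1247850.3\t-8500\t100\t120\t150\t2024\t3\t15\t14\t30\t45.2\t3.2\n"
    "EV002\t45.5245\t9.7665\t9.2\t2683460.2\t1247860.1\t-9200\t110\t130\t160\t2024\t13\t15\t14\t31\t22.8\t2.8\n"
)
print(validate_hypocenters(SAMPLE, separator="\t"))
```

Running a check like this before launching a workflow can catch separator mismatches early, since a wrong separator shows up as a long list of missing columns.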


Fault Network

Parameters for 3D fault network reconstruction using NN search and PCA.

Core Network Parameters

Parameter

Type

Default

Description

monte_carlo_simulations

integer

1000

Number of Monte Carlo iterations for uncertainty quantification. Range: 1-inf. Higher values = more robust statistics but slower computation. Typical: 1000

search_radius_meters

float or string

100.0

Spatial search radius for neighbor detection in meters. Use numeric value or "auto" for automatic optimization. Typical range: 50-1000 m

search_time_window_hours

float or string

9999999

Temporal search window for neighbor detection in hours. Use numeric value or "auto" for automatic optimization. Large value (e.g. 9999999h) includes all events

magnitude_type

string

"ML"

Magnitude type to use for rupture radius calculation. Options: "ML" (local magnitude), "Mw" (moment magnitude)

Outlier Detection

Parameter

Type

Default

Description

remove_outliers

boolean

false

Enable outlier detection and removal before fault network reconstruction (clean-up of hypocenters)

outlier_method

string

"DBSCAN"

Outlier detection algorithm. Options: "DBSCAN" (density-based, good for distinct clusters, default), "LOF" (local outlier factor, good for varying density), "IForest" (isolation forest, robust and fast)

lof_n_neighbors

integer or null

null

LOF-specific: Number of neighbors to consider. null = auto-tuned based on dataset size, or integer 10-50

lof_contamination

string or float

"auto"

LOF-specific: Expected outlier proportion. "auto" or float 0.01-0.5 (1%-50%)

if_n_estimators

integer

100

IForest-specific: Number of trees in the forest. Range: 50-200. Higher = more robust but slower

if_max_samples

string or integer

"auto"

IForest-specific: Number of samples per tree. "auto" = auto-tuned, or integer. Lower = faster but less accurate

if_contamination

float

0.05

IForest-specific: Expected outlier proportion. Float 0.01-0.5 (5% = 0.05, conservative default)

if_random_state

integer

42

IForest-specific: Random seed for Isolation Forest reproducibility

Note: Events with valid focal mechanism data (A=1 or A=2) are protected from outlier removal.
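The protection rule in the note can be expressed as a small filter. This is a sketch under assumptions: events are plain dictionaries, the outlier flags come from whichever detector is configured, and the function name is hypothetical.

```python
def apply_outlier_mask(events, outlier_flags):
    """Drop flagged events unless they carry a valid focal mechanism (A == 1 or 2)."""
    kept = []
    for event, is_outlier in zip(events, outlier_flags):
        protected = event.get("A") in (1, 2)
        if protected or not is_outlier:
            kept.append(event)
    return kept


events = [
    {"ID": "EV001", "A": 0},     # flagged, no focal mechanism: removed
    {"ID": "EV002", "A": 1},     # flagged, but protected by its focal mechanism
    {"ID": "EV003", "A": None},  # not flagged: kept
]
kept_ids = [e["ID"] for e in apply_outlier_mask(events, [True, True, False])]
print(kept_ids)
```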

Focal Mechanism Constraints

Parameter

Type

Default

Description

use_focal_constraints

boolean

false

Use focal mechanism data to constrain fault plane selection. Requires focal_mechanism_file to be specified. When enabled, the algorithm selects between the two nodal planes based on consistency with neighboring events

Automatic Parameter Optimization

Parameter

Type

Default

Description

auto_optimize_parameters

boolean

false

Enable automatic optimization of search_radius_meters and search_time_window_hours. When enabled, ignores manual values for these parameters

optimization_method

string

"optuna"

Optimization algorithm. Options: "optuna" (TPE sampler, recommended, default), "grid_search" (thorough grid search), "pareto" (multi-objective), "heuristic" (fast heuristic)

optimization_use_adaptive_weights

boolean

true

When true, automatically adjusts weights based on number of focal mechanisms (reduces focal weight for datasets with few focals, increases recovery importance) and dataset density (adjusts recovery expectations for sparse datasets). When false, uses fixed weights (original behavior).

optimization_random_state

integer

42

Random seed for optimization reproducibility

optimization_plot_results

boolean

false

Generate visualization plots of optimization results

optimization_r_nn_range

array or null

[50, 1000]

Search radius range for optimization [min_meters, max_meters]. Use null for automatic range determination

optimization_dt_nn_range

array or null

[100, 50000]

Time window range for optimization [min_hours, max_hours]. Use null for automatic range determination

Grid Search Specific Parameters

Parameter

Type

Default

Description

optimization_grid_points

integer

25

Grid resolution for grid_search method. Total evaluations = grid_points². Range: 10-50. Higher = more thorough but slower (25 = 625 evaluations)
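To make the evaluation count concrete, the following sketch builds a grid over the default ranges of optimization_r_nn_range and optimization_dt_nn_range. Linear spacing is an assumption made here for simplicity; HyFI may space its grid differently, and the helper names are hypothetical.

```python
def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive (requires n >= 2)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def build_grid(r_range, dt_range, grid_points):
    """All (search_radius_meters, search_time_window_hours) combinations."""
    radii = linspace(r_range[0], r_range[1], grid_points)
    windows = linspace(dt_range[0], dt_range[1], grid_points)
    return [(r, dt) for r in radii for dt in windows]


grid = build_grid([50, 1000], [100, 50000], grid_points=25)
print(len(grid))  # 625, i.e. 25 squared
```

Since the cost grows quadratically, doubling optimization_grid_points roughly quadruples the run time.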

Optuna Optimization Specific Parameters

Parameter

Type

Default

Description

optimization_n_trials

integer

50

Total number of trials for Optuna optimization. Range: 30-500. Higher = more thorough

optimization_sampler

string

"tpe"

Optuna sampling algorithm. Options: "tpe" (Tree-structured Parzen Estimator, recommended), "cmaes" (CMA-ES), "random" (Random sampling)

optimization_n_startup_trials

integer

10

Number of random trials before sampler-specific optimization starts. Range: 5-20

optimization_early_stopping_rounds

integer or null

null

Early stopping: Stop if no improvement for N consecutive trials. null = disabled. Recommended: 10-20 for n_trials=50, 20-30 for n_trials=200. Saves computation time

optimization_early_stopping_threshold

float

0.0001

Minimum improvement to be considered significant for early stopping. Range: 1e-5 to 1e-3. Lower = more conservative
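The interaction between optimization_early_stopping_rounds and optimization_early_stopping_threshold can be sketched as follows. The sketch assumes the objective is maximised and that "no improvement" means the best score of the last N trials exceeds the earlier best by less than the threshold; both points are assumptions, and the function is not part of HyFI or Optuna.

```python
def should_stop(scores, rounds, threshold=1e-4):
    """Decide whether to stop, given objective values in trial order.

    scores: one value per completed trial (higher is better, by assumption).
    rounds: number of recent trials to inspect; None disables early stopping.
    """
    if rounds is None or len(scores) <= rounds:
        return False
    best_before = max(scores[:-rounds])
    best_recent = max(scores[-rounds:])
    return best_recent - best_before < threshold


# The last three trials improve the best score by only 1e-5: stop.
print(should_stop([0.5, 0.6, 0.60001, 0.6, 0.59], rounds=3))
# The last two trials improve the best score by 0.2: continue.
print(should_stop([0.5, 0.6, 0.7], rounds=2))
```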

Pareto Multi-Objective Optimization Specific Parameters

Parameter

Type

Default

Description

optimization_pareto_sampler

string

"nsga2"

Pareto optimization sampler. Options: "nsga2" (NSGA-II, recommended), "nsga3" (NSGA-III), "random"

optimization_pareto_population

integer

50

Population size for evolutionary algorithms. Range: 30-100


Model Validation

Validation of reconstructed fault planes using focal mechanism data.

Parameter

Type

Default

Description

enabled

boolean

true

Enable model validation module. Requires focal_mechanism_file

Validation Parameters

Parameter

Type

Default

Description

check_magnitude_consistency

boolean

true

Verify magnitude consistency between hypocenter and focal mechanism catalogs

check_location_consistency

boolean

true

Verify spatial location consistency between catalogs

maximum_distance_km

float

1.0

Maximum distance (km) between hypocenter and focal mechanism location for matching

maximum_magnitude_difference

float

0.2

Maximum magnitude difference for matching hypocenters with focal mechanisms
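How the two tolerances might combine when pairing a hypocenter with a focal mechanism is sketched below. The use of a 3D Euclidean distance on the X, Y, Z columns (meters) and the function name are assumptions; the defaults mirror maximum_distance_km and maximum_magnitude_difference.

```python
import math


def is_match(hypo, focal, maximum_distance_km=1.0, maximum_magnitude_difference=0.2):
    """True if both the location and the magnitude agree within tolerance."""
    distance_km = math.dist(
        (hypo["X"], hypo["Y"], hypo["Z"]),
        (focal["X"], focal["Y"], focal["Z"]),
    ) / 1000.0  # X, Y, Z are in meters
    magnitude_gap = abs(hypo["MAG"] - focal["MAG"])
    return (distance_km <= maximum_distance_km
            and magnitude_gap <= maximum_magnitude_difference)


hypo = {"X": 0.0, "Y": 0.0, "Z": -8500.0, "MAG": 3.0}
near = {"X": 300.0, "Y": 400.0, "Z": -8500.0, "MAG": 3.1}  # 0.5 km away
far = {"X": 3000.0, "Y": 0.0, "Z": -8500.0, "MAG": 3.0}    # 3 km away
print(is_match(hypo, near), is_match(hypo, far))
```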


Auto Classification

Automatic classification of fault structures based on orientation and spatial clustering.

Parameter

Type

Default

Description

enabled

boolean

true

Enable automatic classification module

Orientation Clustering Parameters

Parameter

Type

Default

Description

auto_determine_clusters

boolean

true

Automatically determine optimal number of orientation clusters using silhouette analysis

max_clusters

integer

8

Maximum number of orientation clusters to consider when auto-determining. Range: 2-15

number_of_clusters

integer

2

Fixed number of orientation clusters. Used only if auto_determine_clusters=false

clustering_algorithm

string

"vmf_soft"

Clustering algorithm. Options: "vmf_soft" (von Mises-Fisher soft, recommended), "vmf_hard" (von Mises-Fisher hard), "skm" (spherical k-means)

rotate_poles_before_analysis

boolean

true

Rotate fault normal vectors to the same hemisphere before clustering. Recommended for sub-vertical faults. Ensures all poles point in a similar direction

convergence_tolerance

float

1e-6

Convergence tolerance for clustering algorithms (1e-4 for SphericalKMeans, 1e-6 for VonMisesFisher)

maximum_iterations

integer

300

Maximum iterations for clustering algorithms
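The idea behind rotate_poles_before_analysis can be illustrated with a sign flip: a plane's normal n and its opposite -n describe the same plane, so poles can be flipped until they all lie in one hemisphere. The sketch below flips against a reference direction; the choice of reference and the function name are assumptions, and HyFI's actual rotation may work differently.

```python
def to_common_hemisphere(normals, reference=(0.0, 0.0, 1.0)):
    """Flip each normal so that its dot product with the reference is >= 0."""
    aligned = []
    for normal in normals:
        dot = sum(a * b for a, b in zip(normal, reference))
        if dot < 0:
            aligned.append(tuple(-c for c in normal))
        else:
            aligned.append(tuple(normal))
    return aligned


# Two nearly identical sub-vertical planes whose poles point in opposite directions.
poles = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.1)]
aligned = to_common_hemisphere(poles, reference=(1.0, 0.0, 0.0))
print(aligned)
```

For sub-vertical faults the poles are close to horizontal, so a horizontal reference (as in the example) keeps similar planes together, whereas a vertical reference could split them on small dip differences.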

Spatial Sub-Clustering Parameters

All spatial sub-clustering parameters are organized under the spatial_sub_clustering object:

Parameter

Type

Default

Description

enable_spatial_clustering

boolean

true

Enable spatial sub-clustering within orientation clusters to identify separate fault structures

spatial_clustering_method

string

"dbscan"

Spatial clustering method. Options: "dbscan" (density-based, recommended), "kmeans" (k-means), "hierarchical" (agglomerative)

min_events_per_cluster

integer

10

Minimum number of events required to form a valid spatial cluster

use_fault_plane_points_for_clustering

boolean

false

Use enhanced point cloud (fault plane surface points) instead of hypocenters for spatial clustering. Improves spatial resolution when enabled. When true, generates synthetic points on fault plane surfaces

fault_plane_point_density_meters

float

10.0

Target spacing in meters between points on fault plane circumference. Range: 5-50m. Lower values = denser point cloud, more computation. Typical: 10-25m

fault_plane_radius_interval_meters

float

10.0

Spacing in meters between concentric circles when generating fault plane points. Range: 5-50m. Lower values = more circles and higher resolution. Typical: 10-25m

use_anisotropic_eps

boolean

false

Use anisotropic (direction-dependent) distance metrics for DBSCAN. When true, different distance thresholds are applied in-plane vs. out-of-plane, improving detection of elongated fault structures

in_plane_eps_meters

float

500.0

DBSCAN eps parameter in meters for in-plane distances (when using anisotropic metric). Range: 100-1000m. Lower = tighter clusters

out_of_plane_eps_meters

float

50.0

DBSCAN eps parameter in meters for out-of-plane distances (when using anisotropic metric). Range: 10-100m. Typically much smaller than in_plane_eps_meters

anisotropic_min_samples

integer

5

DBSCAN min_samples parameter when using anisotropic metric. Minimum number of neighbors required. Range: 3-20. Higher = more conservative clustering
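One way to picture the anisotropic metric is to split the offset between two points into a component along the fault normal (out-of-plane) and a component within the plane, then scale each by its own eps. The ellipsoidal combination used below is an assumption for illustration, as is the function name; HyFI's metric may combine the two components differently.

```python
import math


def anisotropic_neighbors(p, q, normal, in_plane_eps=500.0, out_of_plane_eps=50.0):
    """True if q lies within the flattened ellipsoid around p defined by the two eps values."""
    offset = [b - a for a, b in zip(p, q)]
    length = math.sqrt(sum(c * c for c in normal))
    unit = [c / length for c in normal]
    out_of_plane = abs(sum(o * u for o, u in zip(offset, unit)))
    total_sq = sum(o * o for o in offset)
    in_plane = math.sqrt(max(total_sq - out_of_plane ** 2, 0.0))
    return (in_plane / in_plane_eps) ** 2 + (out_of_plane / out_of_plane_eps) ** 2 <= 1.0


normal = (0.0, 0.0, 1.0)  # horizontal plane, for easy reading
origin = (0.0, 0.0, 0.0)
print(anisotropic_neighbors(origin, (400.0, 0.0, 0.0), normal))  # 400 m along the plane
print(anisotropic_neighbors(origin, (0.0, 0.0, 60.0), normal))   # 60 m off the plane
```

With the defaults, points may be ten times farther apart along the plane than across it and still count as neighbors, which favours elongated, sheet-like clusters.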


Stress Analysis

Fault stress analysis and failure assessment using regional stress field.

Parameter

Type

Default

Description

enabled

boolean

true

Enable fault stress analysis module

Regional Stress Field Parameters

Parameter

Type

Default

Description

use_shapefile

boolean

false

Enable spatially-varying stress field from shapefile. When true, stress parameters are read from shapefile instead of using fixed values below

shapefile_path

string/null

null

Path to shapefile (.shp) with spatially-varying stress field polygons. Only used if use_shapefile is true. Shapefile must have columns: s1_trend, s1_plunge, s3_trend, s3_plunge, R. Example: "data_examples/Stressfield/CH_stressfield_Kastrup.shp"

sigma1_trend_degrees

float/null

null

σ₁ (maximum principal stress) azimuth/trend in degrees. Range: 0-360, measured clockwise from North. Used as fixed value (if use_shapefile=false) or fallback value. Required if use_shapefile=false

sigma1_plunge_degrees

float/null

null

σ₁ plunge in degrees. Range: 0-90 (0=horizontal, 90=vertical). Used as fixed value (if use_shapefile=false) or fallback value. Required if use_shapefile=false

sigma3_trend_degrees

float/null

null

σ₃ (minimum principal stress) azimuth/trend in degrees. Range: 0-360. Used as fixed value (if use_shapefile=false) or fallback value. Required if use_shapefile=false

sigma3_plunge_degrees

float/null

null

σ₃ plunge in degrees. Range: 0-90. Used as fixed value (if use_shapefile=false) or fallback value. Required if use_shapefile=false

stress_shape_ratio

float/null

null

Stress shape ratio R = (σ₂-σ₃)/(σ₁-σ₃). Range: 0-1 (0: σ₂ = σ₃; 0.5: σ₂ midway between σ₁ and σ₃; 1: σ₂ = σ₁). Used as fixed value (if use_shapefile=false) or fallback value. Required if use_shapefile=false

Note: σ₂ (intermediate principal stress) is automatically calculated from σ₁, σ₃, and the stress shape ratio.

Spatially-Varying Stress Field: When use_shapefile is true and shapefile_path is provided, the algorithm calculates the center coordinate (mean X, Y) of all hypocenters and queries the stress field values from the polygon containing this point.
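The note on σ₂ can be illustrated with a short sketch. It assumes a north-east-down coordinate frame, takes the σ₂ axis as the normalised cross product of the σ₁ and σ₃ axes, and obtains the σ₂ magnitude by inverting R = (σ₂-σ₃)/(σ₁-σ₃). The function names and the example magnitudes are hypothetical; HyFI's internal conventions may differ.

```python
import math


def axis_vector(trend_deg, plunge_deg):
    """Unit vector (north, east, down) for a trend/plunge pair in degrees."""
    trend, plunge = math.radians(trend_deg), math.radians(plunge_deg)
    return (math.cos(plunge) * math.cos(trend),
            math.cos(plunge) * math.sin(trend),
            math.sin(plunge))


def sigma2_axis(s1, s3):
    """Direction perpendicular to both the sigma1 and sigma3 axes."""
    cross = (s1[1] * s3[2] - s1[2] * s3[1],
             s1[2] * s3[0] - s1[0] * s3[2],
             s1[0] * s3[1] - s1[1] * s3[0])
    length = math.sqrt(sum(c * c for c in cross))
    return tuple(c / length for c in cross)


def sigma2_magnitude(sigma1, sigma3, shape_ratio):
    """Invert R = (sigma2 - sigma3) / (sigma1 - sigma3) for sigma2."""
    return sigma3 + shape_ratio * (sigma1 - sigma3)


s1 = axis_vector(trend_deg=0.0, plunge_deg=0.0)   # horizontal, pointing north
s3 = axis_vector(trend_deg=90.0, plunge_deg=0.0)  # horizontal, pointing east
s2 = sigma2_axis(s1, s3)                          # comes out vertical
print(s2, sigma2_magnitude(100.0, 20.0, 0.5))
```

The σ₁ and σ₃ axes must be perpendicular for the result to be a valid principal direction, so it is worth checking that the configured trends and plunges satisfy this.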

Mechanical Properties Parameters

Parameter

Type

Default

Description

pore_pressure_mpa

float

0.0

Pore fluid pressure in MPa. Range: 0-50 MPa typical. Reduces effective normal stress on faults

friction_coefficient

float

0.75

Coulomb friction coefficient μ. Range: 0.6-0.85 typical (Byerlee’s law: ~0.6-1.0). Controls fault reactivation potential

Calculated Outputs:

  • Effective normal stress (Sn_eff)

  • Shear stress (Tau)

  • Rake (slip direction)

  • Instability index (I)

  • Slip tendency

  • Dilation tendency
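Several of these outputs follow from resolving the stress tensor on a fault plane. The sketch below uses common textbook definitions: slip tendency as shear stress over effective normal stress, and dilation tendency as (σ₁ - σₙ)/(σ₁ - σ₃). It works in the principal-stress frame with hypothetical magnitudes in MPa; whether HyFI uses exactly these formulas is an assumption, and the rake and instability index are not covered.

```python
import math


def resolve_on_plane(normal, sigma1, sigma2, sigma3, pore_pressure=0.0):
    """Resolve stress on a plane whose unit normal is given in the principal frame.

    normal: components along the sigma1, sigma2 and sigma3 axes.
    Stresses are compression-positive; all values share one unit (e.g. MPa).
    """
    n1, n2, n3 = normal
    sigma_n = sigma1 * n1 ** 2 + sigma2 * n2 ** 2 + sigma3 * n3 ** 2
    traction_sq = (sigma1 * n1) ** 2 + (sigma2 * n2) ** 2 + (sigma3 * n3) ** 2
    tau = math.sqrt(max(traction_sq - sigma_n ** 2, 0.0))
    sn_eff = sigma_n - pore_pressure  # pore pressure lowers the clamping stress
    return {
        "Sn_eff": sn_eff,
        "Tau": tau,
        "slip_tendency": tau / sn_eff if sn_eff > 0 else math.inf,
        "dilation_tendency": (sigma1 - sigma_n) / (sigma1 - sigma3),
    }


# Plane oriented 45 degrees between the sigma1 and sigma3 axes.
normal = (math.sqrt(0.5), 0.0, math.sqrt(0.5))
dry = resolve_on_plane(normal, 100.0, 60.0, 20.0)
wet = resolve_on_plane(normal, 100.0, 60.0, 20.0, pore_pressure=20.0)
print(dry["slip_tendency"], wet["slip_tendency"])
```

Comparing the two results shows the effect described for pore_pressure_mpa: raising pore pressure lowers the effective normal stress and therefore raises slip tendency. A plane would be considered prone to reactivation when its slip tendency approaches friction_coefficient.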


Visualization

Visualization and export settings for analysis results.

Parameter

Type

Default

Description

enabled

boolean

true

Enable visualization module

Basic Visualization Parameters

Parameter

Type

Default

Description

generate_3d_model

boolean

true

Generate interactive 3D Plotly HTML visualization of complete fault network

generate_stereonet

boolean

true

Generate stereonet (lower-hemisphere equal-area projection) showing fault plane orientations

Fault Surface Interpolation Parameters

Parameter

Type

Default

Description

enable_plane_interpolation

boolean

true

Enable Poisson surface reconstruction to interpolate continuous fault surfaces from point clouds

enable_mesh_stress

boolean

true

Calculate stress parameters (Sn_eff, Tau, rake, sliptend, dilatend) for each mesh face. Requires stress_analysis enabled

mesh_subdivisions

integer

2

Number of mesh subdivision iterations (0-3). Each iteration quadruples face count. Loop subdivision maintains smoothness. Creates denser meshes than increasing poisson_depth

poisson_depth

integer

2

Poisson reconstruction octree depth. Range: 2-12. Higher = more detail but can introduce noise. Recommended: 2-3 for smooth base, then use mesh_subdivisions for density

density_threshold

float

0.01

Minimum density threshold for surface reconstruction. Range: 0.01-0.9. Lower = includes sparse regions, higher = only dense regions

max_distance_factor

float

2.5

Maximum distance factor for point-to-surface association. Range: 1.0-5.0. Higher = more permissive association

min_fault_planes_for_interpolation

integer

10

Minimum number of fault planes required in a cluster to attempt Poisson surface reconstruction. Clusters with fewer fault planes are skipped

fault_plane_radius_interval_meters

float

10.0

Spacing in meters between concentric circles when generating synthetic fault plane points.

fault_plane_point_density_meters

float

10.0

Target spacing in meters between points on fault plane circles. Increase to reduce computation time.
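The effect of the two spacing parameters on point-cloud size can be estimated with a rough sketch. It assumes a circular rupture plane of known radius, one circle every fault_plane_radius_interval_meters, and points placed along each circle at roughly fault_plane_point_density_meters; the rupture_radius_m argument, the minimum of three points per circle, and the function name are hypothetical.

```python
import math


def circle_point_counts(rupture_radius_m, radius_interval_m=10.0, point_density_m=10.0):
    """Return (radius, n_points) for each concentric circle, innermost first."""
    counts = []
    n_circles = int(rupture_radius_m // radius_interval_m)
    for k in range(1, n_circles + 1):
        radius = k * radius_interval_m
        # Circumference divided by the target spacing, with at least 3 points.
        n_points = max(3, round(2 * math.pi * radius / point_density_m))
        counts.append((radius, n_points))
    return counts


fine = circle_point_counts(30.0)                    # default 10 m spacings
coarse = circle_point_counts(30.0, 20.0, 20.0)      # doubled spacings
print(sum(n for _, n in fine), sum(n for _, n in coarse))
```

Doubling both spacings cuts the point count by considerably more than half, because it reduces both the number of circles and the points per circle.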

3D Export Parameters

Parameter

Type

Default

Description

export_vtp

boolean

true

Export all results to VTP (VTK PolyData) format for visualization in ParaView/Blender

export_obj

boolean

false

Export meshes as Wavefront OBJ files for 3D modeling software (MOVE, Blender, MeshLab, etc.)

Exported VTP Files:

  • hypocenters.vtp - All hypocenter points with attributes

  • enhanced_pointcloud.vtp - Enhanced fault plane point cloud

  • rupture_planes_combined.vtp - All rupture plane meshes

  • focal_planes_combined.vtp - All focal mechanism planes (if focal constraints enabled)

  • interpolated_surfaces_*.vtp - Interpolated fault surfaces (if interpolation enabled)


Multi-Sequence Processing

Parameters for catalog segmentation and merging procedures that are required specifically for the multi-sequence workflow in addition to the single-sequence parameters.

See Workflows Guide for complete multi-sequence workflow documentation.

Note: Multi-sequence global settings (parallel_processing, max_workers, save_individual_results) are documented in Global Settings → Multi-Sequence Only Settings.

Step 1: Load Input Data

Data loading in the multi-sequence workflow is configured in the step_1_load_data workflow step. This step loads and prepares the earthquake catalog for segmentation:

"step_1_load_data": {
  "hypocenter_file": "data_examples/SECOS_20250305_HyFI.csv",
  "hypocenter_separator": ",",
  "focal_mechanism_file": "data_examples/SECOS_20250305_FM_HyFI.csv",
  "focal_mechanism_separator": ","
}

Parameter

Type

Required

Description

hypocenter_file

string

Yes

Path to hypocenter catalog file (e.g., “data_examples/SECOS_20250305_HyFI.csv”). This is the full earthquake catalog to be segmented into sequences

hypocenter_separator

string

Yes

Column separator for hypocenter file. Options: "," (CSV), "\t" (TSV), ";" (semicolon). Must match the format of your input file

focal_mechanism_file

string or null

No

Path to focal mechanism catalog. Set to null if not available. Optional but recommended for improved fault plane selection via focal constraints

focal_mechanism_separator

string

No

Column separator for focal mechanism file. Options: ",", "\t", ";". Only needed if focal_mechanism_file is specified

Required Columns (see Input Data section for complete column descriptions):

  • Hypocenter: YR, MO, DY, HR, MI, SC, LAT, LON, Z, X, Y, EX, EY, EZ, ID, MAG (or ML/Mw)

  • Focal Mechanism (if used): ID, Strike1, Dip1, Rake1, Strike2, Dip2, Rake2, A (optional)

Step 2: Segmentation Configuration

Multi-sequence segmentation is configured in the step_2_catalog_segmentation workflow step:

"step_2_catalog_segmentation": {
  "enabled": true,
  "segmentation_steps": [
    {
      "step_name": "Class_A",
      "method": "dbscan",
      "features": ["spatial"],
      "cluster_dimension": "3d",
      "dbscan_eps": 350.0,
      "dbscan_min_samples": 10,
      "min_cluster_size": 20,
      "outlier_handling": "next_step"
    }
  ],
  "final_outlier_handling": "keep"
}

To achieve hierarchical multi-scale clustering, multiple segmentation steps can be defined.

Clustering Method

Parameter

Type

Options

Description

method

string

"dbscan", "hdbscan"

Clustering algorithm to use for segmentation

Method Comparison:

  • DBSCAN: Density-based spatial clustering with fixed distance threshold.

  • HDBSCAN: Hierarchical DBSCAN that adapts to varying densities.

Feature Selection

Parameter

Type

Options

Description

features

array

["spatial"], ["spatial", "temporal"]

Features to use for clustering. Options: ["spatial"] (Spatial clustering only using X, Y, Z coordinates. Groups events by location regardless of time), ["spatial", "temporal"] (Spatiotemporal clustering using X, Y, Z, and time. Groups events that are close in both space and time)

Cluster Dimension

Parameter

Type

Options

Description

cluster_dimension

string

"3d", "2d"

Spatial dimensionality for clustering. Options: "3d" (Full 3D spatial clustering using X, Y, Z coordinates. Standard for most fault analysis), "2d" (Horizontal clustering using only X, Y coordinates (ignores depth). Useful for analyzing lateral distribution or when depth uncertainty is high)

DBSCAN Parameters

Parameter

Type

Default

Description

step_name

string

Required

Label for this segmentation step (e.g., “Class_A”, “Fine_Scale”, “Primary”). Used in output directory names

dbscan_eps

float

Required

Maximum distance between events in the same cluster (meters). Smaller values = tighter clusters. Typical range: 100-1000 m

dbscan_min_samples

integer

10

Minimum number of events required to form a dense cluster core. Higher values = stricter clustering. Typical range: 5-20

min_cluster_size

integer

10

Minimum number of events required to keep a cluster after segmentation. Clusters with fewer events are treated as outliers. Typical range: 10-50

outlier_handling

string

"next_step"

How to handle events not assigned to any cluster. Options: "next_step" (pass to next segmentation step), "keep" (create outlier sequence), "discard" (remove from analysis)

DBSCAN-Specific Parameters

Parameter

Type

Default

Description

dbscan_metric

string

"euclidean"

Distance metric for DBSCAN. Options: "euclidean" (standard Euclidean distance, recommended), "manhattan" (L1 distance), "chebyshev" (Chebyshev distance). Only used with method: "dbscan"

HDBSCAN Parameters

Parameter

Type

Default

Description

hdbscan_min_cluster_size

integer

15

Minimum number of samples in a cluster. Range: 5-50. Lower values = more clusters but potential noise. Higher values = fewer, larger clusters. Only used with method: "hdbscan"

hdbscan_min_samples

integer or null

null

Minimum number of samples in a neighborhood for a point to be considered a core point. null = same as hdbscan_min_cluster_size. Use integer 3-20 for custom values. Only used with method: "hdbscan"

Cluster Geometry Filtering

Parameter

Type

Default

Description

filter_by_aspect_ratio

boolean

false

Enable filtering of clusters based on their geometric aspect ratio (shape). When true, clusters are evaluated by elongation: aspect_ratio = max_eigenvalue / min_eigenvalue. Clusters with aspect ratio below min_aspect_ratio threshold (blob-shaped or compact) are reclassified as noise (-1)

min_aspect_ratio

float

1.5

Minimum aspect ratio threshold for keeping a cluster. Range: 1.0-5.0+. When filter_by_aspect_ratio: true, clusters with aspect_ratio < min_aspect_ratio are discarded.
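The aspect ratio test can be sketched for the two-dimensional case, where the covariance eigenvalues have a closed form. Restricting to 2D and using the population covariance are simplifications made here; the function name is hypothetical, and a 3D implementation would typically use a numerical eigenvalue routine instead.

```python
import math


def aspect_ratio_2d(points):
    """max_eigenvalue / min_eigenvalue of the 2x2 covariance of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    largest, smallest = half_trace + spread, half_trace - spread
    return math.inf if smallest <= 0 else largest / smallest


elongated = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
compact = [(0, 0), (1, 0), (0, 1), (1, 1)]
min_aspect_ratio = 1.5
for name, cluster in (("elongated", elongated), ("compact", compact)):
    ratio = aspect_ratio_2d(cluster)
    print(name, round(ratio, 3), "kept" if ratio >= min_aspect_ratio else "noise")
```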

Temporal Clustering Parameters

Parameter

Type

Default

Description

temporal_window_days

integer

30

Time window for grouping events (days). Range: 1-365. Events within this time window are grouped together regardless of spatial separation. Only used with method: "temporal" or when "temporal" is in features

Temporal Method: Groups events by time windows, independent of location. Useful for identifying earthquake swarms and sequences that occur within specific time periods.

Spatial-Temporal Parameters

Parameter

Type

Default

Description

spatial_weight

float

0.7

Weight for spatial vs temporal features in combined analysis. Range: 0.0-1.0. Use 0.7 = 70% spatial/30% temporal (recommended), 0.5 = equal weight, 0.9 = mostly spatial. Only used with method: "spatial_temporal" or when both "spatial" and "temporal" are in features

Spatial-Temporal Method: Combines spatial and temporal clustering using weighted features. Better for identifying space-time patterns in seismicity.

Outlier Handling (Per-Step)

Parameter

Type

Default

Description

process_outliers

boolean

true

Whether to process outlier events in subsequent segmentation steps. When true, events not assigned to clusters are passed to the next step. When false, outliers are discarded immediately

outlier_handling

string

"next_step"

Strategy for handling unassigned events in this step. Options: "next_step" (pass to next segmentation step, recommended for hierarchical), "keep" (create separate outlier sequence), "discard" (remove from analysis). Applies only to this specific step

Hierarchical Workflow Strategy:

  • Use outlier_handling: "next_step" for all intermediate steps

  • Use outlier_handling: "keep" or "discard" only for the final step

  • This creates a hierarchical multi-scale segmentation where fine-scale clusters are found first, then broader clusters are identified from remaining events
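The strategy above lends itself to a quick consistency check on a configuration before running it. The sketch below only verifies that intermediate steps pass their outliers on. The checker function is hypothetical, and the second step ("Class_B" with its parameter values) is invented for illustration; the first step repeats the example values shown earlier in this section.

```python
def check_outlier_strategy(segmentation_steps):
    """Return a message for every intermediate step that does not use 'next_step'."""
    issues = []
    last = len(segmentation_steps) - 1
    for index, step in enumerate(segmentation_steps):
        handling = step.get("outlier_handling", "next_step")  # documented default
        name = step.get("step_name", f"step {index}")
        if index < last and handling != "next_step":
            issues.append(f"{name}: intermediate step uses '{handling}', expected 'next_step'")
    return issues


steps = [
    {"step_name": "Class_A", "method": "dbscan", "dbscan_eps": 350.0,
     "dbscan_min_samples": 10, "outlier_handling": "next_step"},
    # Hypothetical second, broader step working on the leftovers of Class_A.
    {"step_name": "Class_B", "method": "dbscan", "dbscan_eps": 800.0,
     "dbscan_min_samples": 5, "outlier_handling": "keep"},
]
print(check_outlier_strategy(steps))
```

An empty list means the step order follows the recommended pattern; a step that discards outliers too early would silently starve every later step of events.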

Segmentation-Only Mode

Parameter

Type

Default

Description

segmentation_only

boolean

false

If true, skip fault analysis (step 3) and export VTP files directly after segmentation. Useful for testing and validating segmentation parameters before running full fault plane analysis. Exports individual VTP files per sequence and a combined VTP file with all sequences

Output Files:

  • segmented_sequences_{step_name}_sequence_{ID}.vtp - Individual VTP file per sequence

  • segmented_sequences_{step_name}_combined.vtp - Combined VTP with all sequences (different colors per cluster)

  • segmentation_summary.json - JSON file with statistics (cluster counts, event counts per sequence)

Final Outlier Handling

Parameter

Type

Options

Description

final_outlier_handling

string

"keep", "discard"

How to handle events not clustered after all segmentation steps are complete. "keep" creates a Z_outliers sequence directory with unclustered events, "discard" removes them from analysis and subsequent per-sequence processing

max_outlier_ratio

float

0.3

Maximum acceptable ratio of outlier events relative to total catalog. Range: 0.0-1.0 (0.3 = 30%). If exceeded, a warning is logged but processing continues. Used for validation/diagnostics

Step 3: Per-Sequence Analysis Configuration

After segmentation, HyFI core analysis is applied independently to each identified sequence in the step_3_per_sequence_analysis workflow step:

"step_3_per_sequence_analysis": {
  "description": "Apply HyFI core analysis to each sequence identified in step_2",
  "fault_network": { ... },
  "model_validation": { ... },
  "auto_classification": { ... },
  "stress_analysis": { ... },
  "visualization": { ... }
}

This step applies all single-sequence analysis modules (Fault Network, Model Validation, Auto Classification, Stress Analysis, and Visualization) to each sequence independently. The configuration is identical to the single-sequence workflow parameters documented in the respective sections above.

Step 4: Merge and Export Configuration

After per-sequence analysis, individual results are merged and exported in the step_4_merge_and_export workflow step:

"step_4_merge_and_export": {
  "enabled": true,
  "description": "Merge VTP files and export combined results",
  "merge_vtp_files": true,
  "merged_output": {
    "merged_vtp_path": "./output/HyFI_Database/HyFI_Database_vtp/",
    "export_merged_csv": true,
    "export_summary_statistics": true
  }
}

VTP File Merging

Parameter

Type

Default

Description

merge_vtp_files

boolean

true

Merge individual VTP files from all sequences into combined VTP files. Creates one merged file for each analysis layer (hypocenters, rupture planes, focal planes, interpolated surfaces, slip vectors, etc.). All sequences are combined into single files with metadata tracking which sequence each point/geometry belongs to

Merged VTP Output Files:

  • hypocenters_ALL.vtp - All hypocenter points with cluster attribution

  • rupture_planes_ALL.vtp - All rupture plane meshes from all sequences

  • faults_ALL.vtp - Combined fault plane compilation

  • focal_planes_ALL.vtp - All focal mechanism planes (if constraints enabled)

  • interpolated_surfaces_ALL.vtp - All interpolated fault surfaces

Merged Output Configuration

Parameter

Type

Default

Description

merged_output.merged_vtp_path

string

Required

Directory path for merged VTP files. Example: "./output/HyFI_Database/HyFI_Database_vtp/". Merged VTP files are saved to this location for combined 3D visualization

export_merged_csv

boolean

true

Export enriched merged CSV catalog with all analysis results from all sequences. Output file: HyFI_Database/merged_catalog_enriched.csv. Combines hypocenters + fault plane parameters + auto-classification results. Useful for GIS import and further analysis

export_summary_statistics

boolean

true

Export summary statistics CSV with aggregate metrics per sequence. Output file: HyFI_Database/summary_statistics.csv. One row per sequence with event counts, magnitude range, spatial extent, and mean orientations

Output Directory Structure

output_directory/
├── HyFI_Database/                               # Main export folder
│   ├── merged_catalog_enriched.csv              # All events + analysis results
│   ├── summary_statistics.csv                   # Per-sequence summary
│   ├── HyFI_database_metadata.csv               # Fault system metadata
│   ├── HyFI_database_focal_mechanisms.csv       # Focal mechanism compilation
│   └── HyFI_Database_vtp/                       # Merged VTP files
│       ├── hypocenters_merged.vtp
│       ├── rupture_planes_merged.vtp
│       ├── faults_merged.vtp
│       ├── focal_planes_merged.vtp
│       └── interpolated_surfaces_merged.vtp
├── A1/                                          # Individual sequence outputs
│   ├── 3D_model.html
│   ├── A1_data.csv
│   ├── vtp_export/
│   └── ...
├── A2/
├── B1/
└── Z_outliers/                                  # Unclustered events (if kept)

Happy fault imaging! 🎉