Commit Graph

144 Commits

Author SHA1 Message Date
Eason Zhao
4a86ded3f5 fix: alert manager config 2025-10-23 01:00:53 -04:00
Eason Zhao
de23dff37e feat: alert manager config update 2025-10-21 16:20:57 -04:00
128a85daee update infra-auth-retriver 2025-09-30 16:19:21 +08:00
0e9afd2e4f add grafana init config 2025-09-30 16:12:27 +08:00
73fa737653 update grafana admin password 2025-09-29 18:09:31 +08:00
62a76e26ac feat: add node 2025-09-15 12:09:30 +08:00
b3134726b2 feat: add node 2025-09-15 11:22:14 +08:00
zhenyus
b54ad6a1b0 chore(starrocks): increase fe resource requests to 2c4g and limits to 4c8g
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-09-10 11:10:57 +08:00
zhenyus
ba78fbc007 Add resource requests and limits to Kafka cluster configuration, and update KafkaUser to reference a new secret for password management. Also, enhance KafkaUser ACLs and MongoDB connector configuration for full document change streams.
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-09-05 09:40:49 +08:00
zhenyus
aa46147e33 Add SCRAM-SHA-512 authentication to Kafka cluster configuration in freeleaps-data-platform manifests.
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-09-02 11:47:58 +08:00
zhenyus
d90e0f2692 Remove Flink and Metabase manifests from the freeleaps-data-platform directory, including CRDs, RBAC configurations, deployments, and associated resources. This cleanup streamlines the project by eliminating unused components.
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-09-02 10:38:40 +08:00
zhenyus
604af2dffb Remove Jenkins and Kubernetes overall dashboards from the monitoring system manifests.
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-09-01 00:37:14 +08:00
zhenyus
494caa9e80 Add additional worker nodes for freeleaps-data-platform in inventory.ini
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-08-31 23:53:19 +08:00
zhenyus
d7c072ee6a Remove deprecated Kafka and StarRocks configurations, including README, storage classes, and Vertical Pod Autoscaler files. This cleanup prepares for a more streamlined deployment process.
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-08-21 15:52:39 +08:00
Nicolas
f6c5aa12e4 deploy metabase 2025-08-21 11:07:48 +08:00
Nicolas
c9f681e44b deploy flink 2025-08-21 10:50:07 +08:00
Nicolas
f6f464dbae rewrite readme about starrocks 2025-08-20 18:53:25 +08:00
Nicolas
44f8a10431 Deploy the StarRocks Operator and use the Operator to deploy the HA StarRocks cluster 2025-08-20 18:50:45 +08:00
Nicolas
e7ec6a4258 Installed Strimzi Kafka Operator version 0.45.0
- Three Kafka nodes + three ZooKeeper nodes
- Can tolerate 1 node failure
- 3 replicas distributed across different nodes
- Use Azure Disk SSD
- SCRAM-SHA-512 + ACLs
2025-08-20 17:44:55 +08:00
Nicolas
31f959f7a9 feat: enable log collection for prod environment
- Enable logIngest for chat and freeleaps services in prod
- Add Loki datasource to Grafana for prod environment
- Configure Loki log retention policy (30 days)
- Enable table manager for automatic log cleanup
2025-08-13 09:41:25 +08:00
zhenyus
7a9c695c9e ci(bump): bump reconciler image version for alpha to snapshot-9f1a2bc
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-08-04 15:59:50 +08:00
zhenyus
3988ff13a8 ci(bump): update reconciler image version for alpha to 1.0.2
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-07-31 15:30:16 +08:00
zhenyus
c2d2fa6345 fix: update Jenkins token in gitea webhook configuration
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-07-24 16:51:35 +08:00
jingyao1991
4d871e25c4 Merge pull request 'add Tips section with service auth commands and links' (#1) from ice-feature into master
Reviewed-on: https://gitea.freeleaps.mathmast.com/freeleaps/freeleaps-ops/pulls/1
Reviewed-by: jingyao1991 <jingyao1991@noreply.gitea.freeleaps.mathmast.com>
2025-07-04 03:50:13 +00:00
29107247b1 docs: add Tips section with service auth commands and links 2025-07-04 11:19:04 +08:00
zhenyus
dca5cffa55 fix(flink): update resource requests and limits for jobmanager and taskmanager
- Adjusted CPU and memory requests and limits for both jobmanager and taskmanager to optimize resource allocation.
- Commented out the resourcesPreset parameter for clarity in configuration.

Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-07-04 11:03:26 +08:00
zhenyus
9c07783780 feat(kafka, pinot, star-rocks): update configurations and resource limits across multiple components
- Updated Kafka configuration to specify Kubernetes version and API versions.
- Enabled Vertical Pod Autoscaler (VPA) for Pinot and adjusted resource limits for CPU and memory.
- Removed obsolete certificate configuration for Pinot.
- Enhanced StarRocks values.yaml with comprehensive configurations for deployment, including service specifications and resource requests/limits.
- Increased timeout settings in production values for Freeleaps to improve service resilience.

Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-06-26 23:04:03 +08:00
zhenyus
a3b3b3f12f feat(inventory): add new worker node configuration for cost reduction in inventory.ini
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-06-24 14:54:04 +08:00
zhenyus
db590f3f27 refactor: update gitea-webhook-ambassador Dockerfile and configuration
- Changed the build process to include a web UI build stage using Node.js.
- Updated Go build stage to copy web UI files to the correct location.
- Removed the main.go file as it is no longer needed.
- Added SQLite database configuration to example config.
- Updated dependencies in go.mod and go.sum, including new packages for JWT and SQLite.
- Modified .gitignore to include new database and configuration files.

Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-06-10 16:00:52 +08:00
zhenyus
c8b68afc75 feat: Update Pinot configuration and RBAC rules
- Enhanced the Pinot Helm chart values.yaml with comprehensive configurations for controller, broker, server, minion, and zookeeper components.
- Added support for pod disruption budgets and custom resource definitions in RBAC rules.
- Introduced a new script for managing Kubernetes service port forwarding, allowing users to easily forward, stop, and list active services.
- Updated helm repository list to ensure proper access to necessary charts.

Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-05-20 16:00:32 +08:00
zhenyus
db0cd26f4b feat: update RBAC configurations for data platform and mathmast roles
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-05-12 10:56:58 +08:00
zhenyus
b7c11d2829 feat: update RBAC configurations and add Jenkinsfile for aml-services
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-05-12 09:56:54 +08:00
Joe
0c55ce90fa Merge branch 'refs/heads/master' into dev_zhengyang 2025-05-10 15:56:44 +08:00
Joe
134caeaeb2 feat: add starrocks to repo list 2025-05-10 15:48:29 +08:00
Joe
2d7b58dad7 Merge branch 'dev_zhengyang' of ssh.dev.azure.com:v3/freeleaps/freeleaps-ops/freeleaps-ops into dev_zhengyang 2025-05-09 17:53:34 +08:00
Joe
f4df870452 fix: update freeleaps-data-platform 2025-05-09 17:52:53 +08:00
zhenyus
f41973befc ci(bump): update freeleaps-cluster-authenticator version to 0.0.3-20250509 and add refresh-auth command
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-05-09 15:25:14 +08:00
zhenyus
2f7128a51c feat: update namespaces and add RBAC roles for freeleaps data platform and monitoring systems
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-05-09 13:10:13 +08:00
Zheng Yang
51cbfbef07 Deleted kafka-monitoring-table.yaml 2025-05-09 02:16:08 +00:00
Joe
a7025081a1 feat: add freeleaps-data-platform 2025-05-09 10:14:16 +08:00
Joe
b2b1fd274f feat: add freeleaps-data-platform 2025-05-09 10:07:00 +08:00
zhenyus
15dd1fba0b fix(opentelemetry): update resource attributes in distributor and log transformation for improved metadata extraction
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-21 17:43:08 +08:00
zhenyus
473f5cea54 fix(opentelemetry): update log transformation to use resource attributes for application and environment
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-21 17:17:40 +08:00
zhenyus
c106c9a624 fix(loki): update resource_attributes to use regex for indexing labels
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-21 17:05:53 +08:00
zhenyus
222f5ee0fb feat(opentelemetry): add structured metadata support and update log transformation logic
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-21 16:16:03 +08:00
zhenyus
67c4772407 fix: update imagePullPolicy to 'Always' for chat, backend, and frontend services; change branch name from 'master' to 'main' in configmap
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-18 14:36:35 +08:00
zhenyus
eedb1cefc7 fix: update CA injection annotations and webhook service URLs for OpenTelemetry operator
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-16 05:49:54 +08:00
zhenyus
a0d88d9507 Add OpenTelemetry Collector configuration for log ingestion
- Introduced a new OpenTelemetryCollector resource in the Helm chart.
- Configured filelog receiver to ingest logs based on specified patterns.
- Added processors for Kubernetes attributes and resource metadata.
- Set up Loki exporter for log forwarding with appropriate labels.
- Configured logging verbosity and defined log processing pipelines.

Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-16 05:45:40 +08:00
zhenyus
decca8e7a1 fix: update labels for Fluent Bit resources to ensure correct identification
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-16 00:46:39 +08:00
zhenyus
358f131809 Add Fluent Bit configuration for log collection
- Introduced Fluent Bit resources including FluentBit, Parser, Output, FluentBitConfig, and ClusterInput.
- Configured default resource requests and limits for Fluent Bit.
- Set up JSON parser with customizable time key and format.
- Established output forwarding to Fluentd service in the logging system.
- Enabled conditional deployment based on the `fluentbit.enabled` value in Helm chart.

Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-16 00:18:16 +08:00