【Bioinformatics】 Understanding and Running xFuse

Understanding and Running xFuse (ref)

Recommended post: 【Bioinformatics】 Bioinformatics Analysis Table of Contents

1. Overview [Body]

2. step 1. forward scheme [Body]

3. step 2. inference [Body]

4. step 3. prediction [Body]

5. Running the program [Body]

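1. Overview [Contents]

⑴ Purpose: to turn a low-resolution ST (spatial transcriptomics) library into a high-resolution ST library

① Under the Visium protocol a capture area holds at most 4,992 spots, which is quite low resolution

⑵ Background theory

① Discrete probability theory

② Continuous probability theory

③ CNN (convolutional neural network)

⑶ Assumptions

① I (image data) : assumed to follow a Gaussian distribution

② X (expression counts) : assumed to follow a negative binomial distribution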
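The two distributional assumptions can be made concrete with a short simulation. This is a minimal sketch, not xFuse code: all parameter values are made up for illustration, and the negative binomial is parameterized by a number of failures before stopping r and a success probability p, which means passing 1 - p to NumPy because NumPy counts failures before n successes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Image intensities I: Gaussian with mean mu and standard deviation sigma
# (illustrative values, not taken from xFuse).
mu, sigma = 0.6, 0.1
pixels = rng.normal(mu, sigma, size=100_000)

# Counts X: negative binomial = number of successes observed before
# r failures, with success probability p (illustrative values).
r, p = 5.0, 0.4
# NumPy's negative_binomial(n, q) counts failures before n successes with
# success probability q, so swapping the roles means passing q = 1 - p.
counts = rng.negative_binomial(r, 1.0 - p, size=100_000)

nb_mean = r * p / (1.0 - p)        # analytical mean
nb_var = r * p / (1.0 - p) ** 2    # analytical variance, always larger than the mean

print(f"Gaussian: sample mean {pixels.mean():.3f}, sample sd {pixels.std():.3f}")
print(f"NB: sample mean {counts.mean():.3f} (theory {nb_mean:.3f})")
print(f"NB: sample var  {counts.var():.3f} (theory {nb_var:.3f})")
```

The variance of the negative binomial exceeds its mean, which is one common reason it is chosen over the Poisson distribution for sequencing counts.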

2. step 1. forward scheme [Contents]

⑴ Variables 1. Image data input

① n : index of each section (as used in I_n and X_n)

② I_n : histological image data of each section n

③ (x, y) : pixel coordinates

④ c : image channels

⑵ Variables 2. ST data input

① n : index of each section (as used in I_n and X_n)

② X_n : spatial expression data of each section n

③ m : metagene

⑶ Variables 3. Output

① s_n : pixel-wise scaling factor

② a_n : metagene activity

③ μ_n : image distribution mean

④ σ_n : standard deviation

⑤ X : super-resolved expression, which is linked to the observed expression X̃

⑷ Variables 4. Model

① G : convolutional generator network, designed similarly to U-Net

② Z : latent tissue state

③ θ : learnable parameter

④ r_ngxy : number of failures before stopping for each n, g, x, and y

⑤ p_ng : success probability for each n and g

⑥ L : weight matrix

⑦ t_g, u_g : gene-specific baseline

⑧ E, F : fixed effects to control for condition-wise batch effect

⑨ β_n : row vector of concatenated indicator variables

⑸ Formulation

① Equation 9 links the super-resolved expression X to the observed expression X̃
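One property that makes such a link between X and X̃ convenient is that negative binomial variables sharing the same p add up to another negative binomial whose r is the sum of the individual r values. This would fit the index structure above, where r_ngxy varies per pixel while p_ng does not. The sketch below checks the property numerically for one gene in one spot. It is an illustration under assumptions, not xFuse's implementation: the pixel-level r values are random placeholders, and `spot_mask` is a hypothetical name for the set of pixels covered by a spot.

```python
import numpy as np

rng = np.random.default_rng(1)

H, W = 32, 32                      # toy image size in pixels
p = 0.3                            # shared success probability (one gene)
r_pixel = rng.gamma(2.0, 0.05, size=(H, W))   # placeholder r for each (x, y)

# Hypothetical circular spot centered in the image.
yy, xx = np.mgrid[:H, :W]
spot_mask = (xx - W / 2) ** 2 + (yy - H / 2) ** 2 <= 8 ** 2

n_draws = 20_000
r_in_spot = r_pixel[spot_mask]     # 1-D array of r values inside the spot

# Super-resolved counts X: one NB draw per pixel (NumPy takes 1 - p,
# because it counts failures before n successes).
x_pixels = rng.negative_binomial(
    r_in_spot, 1.0 - p, size=(n_draws, r_in_spot.size)
)

# Observed spot-level count: sum of the pixel counts under the spot.
x_spot_summed = x_pixels.sum(axis=1)

# Direct draw from NB(sum of r, p) for comparison.
x_spot_direct = rng.negative_binomial(r_in_spot.sum(), 1.0 - p, size=n_draws)

print("mean:", x_spot_summed.mean(), x_spot_direct.mean())
print("var: ", x_spot_summed.var(), x_spot_direct.var())
```

Both routes should give matching means and variances up to sampling noise, so a spot-level likelihood can be evaluated without enumerating per-pixel counts.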

3. step 2. inference [Contents]

⑴ Definition: the process of recovering Z_n, L, E, and F from the observed expression X̃ and the image I

⑵ Variable definitions

① φ : variational parameter

② R : convolutional recognition network, designed similarly to U-Net

③ h_φ : appropriate shift-and-scale transformation

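⑶ Formulation

① φ is obtained by minimizing the Kullback-Leibler divergence between the variational distribution q_φ and the posterior

② L : objective function (not to be confused with the weight matrix L of step 1)

○ Computed via Monte Carlo sampling

○ This L is also called the ELBO (evidence lower bound)

③ The Adam optimizer is used to update the parameters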
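The pieces above (a shift-and-scale transformation, a Monte Carlo estimate of the ELBO, and Adam updates) can be seen working together on a toy problem. The sketch below is not xFuse's inference code: it assumes a one-dimensional model z ~ N(0, 1), x | z ~ N(z, 1), chosen only because its posterior N(x/2, 1/2) and log evidence are known in closed form, and it writes out Adam by hand in NumPy. Since log p(x) = ELBO + KL(q_φ ‖ posterior), maximizing the ELBO is the same as minimizing the KL divergence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = 1.5                                   # one observed data point

def log_joint(z):
    """log p(x, z) for the toy model z ~ N(0, 1), x | z ~ N(z, 1)."""
    return -0.5 * z**2 - 0.5 * (x - z) ** 2 - np.log(2 * np.pi)

def elbo(m, s, n_samples=1000):
    """Monte Carlo ELBO estimate for q(z) = N(m, s^2)."""
    eps = rng.standard_normal(n_samples)
    z = m + s * eps                       # shift-and-scale transformation
    log_q = -0.5 * eps**2 - np.log(s) - 0.5 * np.log(2 * np.pi)
    return np.mean(log_joint(z) - log_q)

# Variational parameters phi = (m, rho), with s = exp(rho) to keep s > 0.
phi = np.array([0.0, 0.0])
m1, v1 = np.zeros(2), np.zeros(2)         # Adam moment estimates
lr, b1, b2, tiny = 0.02, 0.9, 0.999, 1e-8

for t in range(1, 3001):
    m, s = phi[0], np.exp(phi[1])
    eps = rng.standard_normal(32)
    z = m + s * eps
    dz = x - 2.0 * z                      # d/dz log p(x, z)
    # Reparameterization gradients of the ELBO; the "+ 1" comes from the
    # entropy of q, whose derivative with respect to rho is 1.
    grad = np.array([dz.mean(), (dz * eps).mean() * s + 1.0])
    # Adam update (ascent, because the ELBO is maximized).
    m1 = b1 * m1 + (1 - b1) * grad
    v1 = b2 * v1 + (1 - b2) * grad**2
    phi += lr * (m1 / (1 - b1**t)) / (np.sqrt(v1 / (1 - b2**t)) + tiny)

m_fit, s_fit = phi[0], np.exp(phi[1])
log_evidence = -0.25 * x**2 - 0.5 * np.log(4 * np.pi)   # log N(x; 0, 2)
print(f"fitted q: mean {m_fit:.3f}, sd {s_fit:.3f}")
print(f"exact posterior: mean {x / 2:.3f}, sd {np.sqrt(0.5):.3f}")
print(f"ELBO estimate {elbo(m_fit, s_fit):.3f}, log evidence {log_evidence:.3f}")
```

In xFuse the role of (m, s) is played by the outputs of the recognition network R, so φ consists of network weights and not two scalars, but the update loop has the same shape.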

4. step 3. prediction [Contents]

⑴ Definition: the process of predicting specific statistics with the trained model

⑵ Variable definitions

① χ : different quantity (the statistic to be predicted)

② {A₁, ···, A_K} : arbitrarily defined areas

③ ν_k : spatial gene expression in a specific defined area A_k. Related to the mean of the given probability distribution

④ X_k : read count in a specific defined area A_k. Related to an observed value drawn from the given probability distribution

⑤ η_g : differential gene expression of gene g between A₁ and A₂

○ Normalization and log-transformation must be taken into account

⑶ Formulation
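The distinction between ν_k, X_k, and η_g can be illustrated with posterior samples. The sketch below is a toy under stated assumptions, not xFuse's prediction code: the "posterior samples" of pixel-level r are random placeholders, `area_1` and `area_2` are hypothetical masks for A₁ and A₂, and normalizing by the area's total expression before the log-transformation is only one plausible way to define η_g.

```python
import numpy as np

rng = np.random.default_rng(2)
S, G, H, W = 200, 5, 16, 16        # posterior samples, genes, image height/width
p = 0.3                            # shared NB success probability (placeholder)

# Placeholder posterior samples of pixel-level r; gene 0 is made twice as
# active in the left half of the image so that a difference exists.
r = rng.gamma(2.0, 0.5, size=(S, G, H, W))
r[:, 0, :, : W // 2] *= 2.0

area_1 = np.zeros((H, W), dtype=bool)   # A_1: left half of the image
area_1[:, : W // 2] = True
area_2 = ~area_1                        # A_2: right half of the image

def area_mean(mask):
    """nu_k: expected expression in the area (mean of the NB sum)."""
    return r[:, :, mask].sum(axis=-1) * p / (1.0 - p)      # shape (S, G)

def area_counts(mask):
    """X_k: read counts drawn from the NB for the area (an observation)."""
    return rng.negative_binomial(r[:, :, mask].sum(axis=-1), 1.0 - p)

def log_normalized(nu):
    """Normalize by total expression across genes, then log-transform."""
    return np.log(nu / nu.sum(axis=1, keepdims=True))

nu_1, nu_2 = area_mean(area_1), area_mean(area_2)
eta = log_normalized(nu_1) - log_normalized(nu_2)          # shape (S, G)

print("posterior mean of eta_g:", eta.mean(axis=0).round(2))
print("P(eta_g > 0):", (eta > 0).mean(axis=0).round(2))
print("one draw of X_1:", area_counts(area_1)[0])
```

Because η_g is computed per posterior sample, its uncertainty comes for free: the fraction of samples with η_g > 0 can be read as a posterior probability that gene g is relatively more expressed in A₁ than in A₂.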

5. Running the program (on an RTX 3090) [Contents]

# 1. packages

conda install git
pip install --user git+https://github.com/ludvb/xfuse@master


# 2. datasets

curl -Lo section1.jpg https://www.spatialresearch.org/wp-content/uploads/2016/07/HE_layer1_BC.jpg
curl -Lo section2.jpg https://www.spatialresearch.org/wp-content/uploads/2016/07/HE_layer2_BC.jpg
curl -Lo section3.jpg https://www.spatialresearch.org/wp-content/uploads/2016/07/HE_layer3_BC.jpg
curl -Lo section4.jpg https://www.spatialresearch.org/wp-content/uploads/2016/07/HE_layer4_BC.jpg

curl -Lo section1.tsv https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer1_BC_count_matrix-1.tsv
curl -Lo section2.tsv https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer2_BC_count_matrix-1.tsv
curl -Lo section3.tsv https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer3_BC_count_matrix-1.tsv
curl -Lo section4.tsv https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer4_BC_count_matrix-1.tsv

curl -Lo section1-alignment.txt https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer1_BC_transformation.txt
curl -Lo section2-alignment.txt https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer2_BC_transformation.txt
curl -Lo section3-alignment.txt https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer3_BC_transformation.txt
curl -Lo section4-alignment.txt https://www.spatialresearch.org/wp-content/uploads/2016/07/Layer4_BC_transformation.txt


# 3. install a CUDA-enabled PyTorch build
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html

# 4. convert each section (counts + image + alignment) into xfuse's input format
xfuse convert st --counts section1.tsv --image section1.jpg --transformation-matrix section1-alignment.txt --scale 0.15 --save-path section1
xfuse convert st --counts section2.tsv --image section2.jpg --transformation-matrix section2-alignment.txt --scale 0.15 --save-path section2
xfuse convert st --counts section3.tsv --image section3.jpg --transformation-matrix section3-alignment.txt --scale 0.15 --save-path section3
xfuse convert st --counts section4.tsv --image section4.jpg --transformation-matrix section4-alignment.txt --scale 0.15 --save-path section4

# 5. run (my-config.toml must already exist; see the xfuse README for its format)
xfuse run ./my-config.toml --save-path ./my-run
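Before starting a long run it can help to confirm that the installed PyTorch build actually sees the GPU. The check below is a generic PyTorch sketch and not part of xfuse; `gpu_report` is a hypothetical helper name, and the function returns a minimal report when torch cannot be imported.

```python
def gpu_report():
    """Return a small dict describing the PyTorch/CUDA setup, if any."""
    try:
        import torch
    except ImportError:
        return {"torch": None, "cuda_available": False}

    report = {
        "torch": torch.__version__,          # version string of the installed build
        "cuda_build": torch.version.cuda,    # CUDA version the wheel was built for
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
        report["capability"] = torch.cuda.get_device_capability(0)
    return report

print(gpu_report())
```

If `cuda_available` is False, the run will either fail or fall back to the CPU, so it is worth resolving the driver or wheel mismatch first.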

Posted: 2022.01.11 09:28

