I have a table that looks like this (with many more rows):
Patient startday stopday drug
P1 2 4 D-A
P1 3 7 D-B
P1 9 13 D-C
P2 0 6 D-A
P2 2 10 D-C
P2 3 7 D-D
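P2 8 12 D-B
In my MATLAB code, I need to check whether two given drugs were taken during overlapping start/stop day intervals. For example, how can I test for the co-occurrence of D-A and D-D in patient P2? In other words: patient P2 took both drugs, D-A and D-D, on the same days (days 0-6 and 3-7, which overlap on days 3-6). My output needs to be a column of 1/0 values stating whether the two drugs overlapped. (The end goal of the code, in case it helps, is a survival analysis in which this will be a covariate.)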
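I'm a beginner, and the strategy I'm considering is: from a list of unique patients, select each patient in turn, then select drug 1 and drug 2 and see whether their start/stop days overlap. Write 1 for those rows and 0 for all other rows.
Any help is appreciated!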
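The overlap check at the heart of this strategy can be sketched for a single pair of intervals, using the D-A and D-D days of patient P2 from the example. This is only a minimal illustration: the variable names are made up for the sketch, and it assumes that stop days are inclusive, which matches the stated overlap on days 3-6.

```matlab
% Intervals from the example: D-A on days 0-6, D-D on days 3-7 (patient P2).
startA = 0; stopA = 6;
startD = 3; stopD = 7;

% Two closed intervals overlap when each one starts no later than the other stops.
is_overlap = startA <= stopD && startD <= stopA;

% If they overlap, the shared days run from the later start to the earlier stop.
shared = [max(startA, startD), min(stopA, stopD)];
```

With these inputs `is_overlap` is true and `shared` is `[3 6]`.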
Posted on 2020-05-17 21:43:36
Here is one approach that I think does what you need.
% data setup
Patient = categorical(["P1";"P1";"P1";"P2";"P2";"P2";"P2"]);
startday = [2;3;9;0;2;3;8];
stopday = [4;7;13;6;10;7;12];
drug = categorical(["D-A";"D-B";"D-C";"D-A";"D-C";"D-D";"D-B"]);
t = table(Patient, startday, stopday, drug);

% processing
unique_patients = unique(t.Patient);
unique_drugs = unique(t.drug);
% get each unique pair of drugs (i.e. D-A & D-B, D-A & D-C ... D-C & D-D)
drug_pairs = nchoosek(unique_drugs, 2);
% one row per pair; length(drug_pairs) would return 2 if there were only one pair
num_pairs = size(drug_pairs, 1);
% results matrix to store the data in for X unique patients and Y pairs of drugs
results = zeros(length(unique_patients), num_pairs);
for i = 1:length(unique_patients)
    patient = unique_patients(i);
    patient_data = t(t.Patient == patient,:);
    patient_drugs = patient_data.drug;
    for j = 1:num_pairs
        drug_pair = drug_pairs(j,:);
        % find the data for this patient with each of these drugs
        drug1_data = patient_data(patient_drugs == drug_pair(1),:);
        drug2_data = patient_data(patient_drugs == drug_pair(2),:);
        % size(drugN_data, 1) will be 0 if this patient didn't have this drug
        % so we check to make sure both drugs were administered
        if size(drug1_data, 1) > 0 && size(drug2_data, 1) > 0
            % if drug 2 was stopped before drug 1 started or vice versa, there's no overlap
            % compare every drug 1 course (rows) with every drug 2 course (columns),
            % so a patient with several courses of the same drug is handled too
            no_overlap = drug1_data.stopday < drug2_data.startday.' | drug2_data.stopday.' < drug1_data.startday;
            % if any pair of courses overlaps, we store a 1 in results
            if any(~no_overlap(:))
                results(i,j) = 1;
            end
        end
    end
end

% generate the names for the columns for each drug pair - can be edited as desired
drug_pair_column_names = string(drug_pairs(:,1)) + " & " + string(drug_pairs(:,2));
% construct the final table by joining:
% - first column, the patient names
% - rest of the columns, the results data we calculated above
results_table = [table(unique_patients, 'VariableNames', "Patient"), array2table(results, 'VariableNames', drug_pair_column_names)];

Result:
results_table =
    Patient    D-A & D-B    D-A & D-C    D-A & D-D    D-B & D-C    D-B & D-D    D-C & D-D
    _______    _________    _________    _________    _________    _________    _________
    P1             1            0            0            0            0            0
    P2             0            1            1            1            0            1

https://stackoverflow.com/questions/61846476
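The question asks for a single column of 1/0 values, one per row of the original table, rather than one row per patient. A minimal sketch of that variant follows, applying the same overlap test to one chosen pair of drugs. The names `drugA`, `drugB` and the new `overlap` column are hypothetical, chosen only for this illustration, and stop days are assumed to be inclusive.

```matlab
% Rebuild the example table from the question.
Patient  = categorical(["P1";"P1";"P1";"P2";"P2";"P2";"P2"]);
startday = [2;3;9;0;2;3;8];
stopday  = [4;7;13;6;10;7;12];
drug     = categorical(["D-A";"D-B";"D-C";"D-A";"D-C";"D-D";"D-B"]);
t = table(Patient, startday, stopday, drug);

% Hypothetical choice of the two drugs to compare.
drugA = "D-A";
drugB = "D-D";

% One 0/1 flag per row of the original table (assumed column name).
t.overlap = zeros(height(t), 1);

patients = unique(t.Patient);
for i = 1:numel(patients)
    in_patient = t.Patient == patients(i);
    rowsA = find(in_patient & t.drug == drugA);
    rowsB = find(in_patient & t.drug == drugB);
    % Compare every course of drug A with every course of drug B for this patient.
    for a = rowsA.'
        for b = rowsB.'
            % Closed intervals overlap when each starts on or before the other's stop day.
            if t.startday(a) <= t.stopday(b) && t.startday(b) <= t.stopday(a)
                t.overlap([a b]) = 1;
            end
        end
    end
end

disp(t)
```

For the example data, only the two P2 rows for D-A (days 0-6) and D-D (days 3-7) are flagged with 1. P1 took D-A but never D-D, so all of its rows stay 0.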